...
- Create applications using BD services (50 mins)
- Conversion Example (20 mins): Convert a collection of images/ps/odp/audio/video files to png/pdf/ppt/mp3. This demonstrates that if you have a directory of files in old file formats, you can use BD to convert them all without installing any software. (Emphasizes conversion)
- Make sure imagemagick and ffmpeg converters are running before the demo
- Obtain the BD token/key: ask participants to refer to the previous bdfiddle step or use the Python library
- Ask participants to check which output formats are available for specific input formats
- Ask participants to call the BD service through the Python library
- TODO:
- Inna Zharnitsky: Provide a Python script for this and let participants use the Python library to call the BD service
- Smruti Padhy: Provide step-by-step instructions with screenshots to do this
- (Optional) Provide an R script for this problem.
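A minimal sketch of how the conversion script might be organized, assuming the actual BD call is supplied by the participant as a callable. The names `TARGET_FORMATS`, `plan_conversions`, and `convert_directory` are hypothetical, and the extension mapping is only an illustration, not the list of formats BD supports.

```python
import os

# Hypothetical mapping from legacy input extension to desired output format.
TARGET_FORMATS = {"ps": "pdf", "odp": "ppt", "bmp": "png", "wav": "mp3"}


def plan_conversions(directory, targets=TARGET_FORMATS):
    """Return (input_path, output_format) pairs for files with a known extension."""
    plan = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        ext = os.path.splitext(name)[1].lstrip(".").lower()
        if os.path.isfile(path) and ext in targets:
            plan.append((path, targets[ext]))
    return plan


def convert_directory(directory, convert, targets=TARGET_FORMATS):
    """Call convert(input_path, output_format) for every planned file.

    `convert` is a placeholder for whatever wraps the BD Python library
    or REST API; it is an assumption of this sketch, not a real BD function.
    """
    return [convert(path, fmt) for path, fmt in plan_conversions(directory, targets)]
```

Passing the conversion call in as an argument keeps the directory walk testable without a running BD service or a token.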
- Extraction/Indexing/Retrieval Example (20 mins): Given a collection of images with embedded text, and audio files, search the images/audio files based on their content. (Emphasizes extraction from unstructured data, indexing, and content-based retrieval)
- Make sure OCR, face, and speech2text extractors are running before starting the demo
- Images can be uploaded from a local directory or obtained from an external web service.
- Let the participant use the python library of BD to obtain key/token and submit extraction request to BD-API gateway
- Once the technical metadata is obtained from BD, write the tags and technical metadata to a local file or a Python dictionary.
- Search by tags/technical metadata using a linear search over the index file
- TODO
- Make sure the following extractors post their metadata as JSON-LD:
- OCR
- Face
- Eyes
- Closeup
- Smruti Padhy: speech2text
- Create an example dataset of images and audio files against which we can run interesting queries
- (Optional) Provide a code snippet that uses an external service to obtain images, e.g. the Flickr API.
- This will only be provided as an example and will not be used for the rest of the code.
- Provide the link to the current BD REST API and create a document/wiki page showing step-by-step screenshots of obtaining a key/token using python library.
- Marcus Slavenas: Write a Python script that will serve as a stub for the BD client
- The participant will fill in the code to use python library to call BD REST API and submit their requests.
- The Python script should write the tags and technical metadata to a local file. (It can probably use the Python library's index method, which writes them as feature vectors.)
- Marcus Slavenas: Write a Python script that runs interesting searches/queries against the index file. Again, it can probably use the Python library's find method or just read the local file.
- (Optional) Provide an R script for this problem
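The local index and linear search described above could be sketched as follows. This does not use the BD Python library's index or find methods; the record layout (`tags`, `metadata`) and all function names are assumptions made for illustration.

```python
import json


def add_to_index(index, filename, tags, metadata):
    """Store the tags and technical metadata returned for one file."""
    index[filename] = {"tags": list(tags), "metadata": metadata}


def save_index(index, path):
    """Write the whole index to a local JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f, indent=2, sort_keys=True)


def load_index(path):
    """Read an index previously written by save_index."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def search(index, term):
    """Linear search: return files whose tags or metadata contain the term."""
    term = term.lower()
    hits = []
    for filename, entry in index.items():
        haystack = " ".join(entry["tags"]) + " " + json.dumps(
            entry["metadata"], ensure_ascii=False
        )
        if term in haystack.lower():
            hits.append(filename)
    return sorted(hits)
```

A case-insensitive substring match over every entry is enough for a small workshop dataset; nothing here scales beyond that.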
- Conversion & Extraction Example (10 mins): Given an image file, convert it to a different format and obtain face detection and OCR results.
- TODO
- Write a Python script that does the conversion and then sends the converted file to the BD service.
- Create a step-by-step instructions document with screenshots.
- (Optional) Provide an R script for this problem
- TODO
- (Optional) Combination of Conversion & Extraction Example: Obtain AmeriFlux data and convert it into *.clim format (similar to CSV but tab-separated) for the SIPNET model. Calculate the average air temperature and its standard deviation. (This will emphasize both conversion and analysis)
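As a rough illustration of the conversion-plus-analysis idea, the sketch below turns comma-separated text into tab-separated text and computes the mean and sample standard deviation of one column. It does not reproduce the real *.clim column layout, and the column name `TA` for air temperature is an assumption about the input file.

```python
import csv
import io
import statistics


def csv_to_tsv(csv_text):
    """Rewrite comma-separated text as tab-separated text."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


def column_stats(tsv_text, column):
    """Return (mean, sample standard deviation) of a numeric column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    values = [float(row[column]) for row in reader]
    return statistics.mean(values), statistics.stdev(values)
```

`statistics.stdev` needs at least two values; use `statistics.pstdev` instead if the population standard deviation is wanted.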
...
- Part 1: Teach to write an extractor (35 mins)
- Start with the bd-template extractor, which is the word count extractor.
- Ask participants to modify the extractor so that it uses 'grep' to find a specific pattern within the file.
- Ask them to change the name of the extractor from ncsa.wordcount to ncsa.grep.
- Include yes/no in the metadata to indicate whether the pattern was found.
- Briefly describe JSON-LD support. Provide the intuition behind JSON-LD with a simple example. No need to go into the details of RDF.
- TODO
- Smruti Padhy, Marcus Slavenas, Jong Lee: Provide step-by-step instructions/screenshots for updating the extractor and the output as seen in the Clowder GUI. Also provide a link to JSON-LD for further reading. Provide the minimum software requirements for development, such as Clowder, RabbitMQ, MongoDB, pyclowder, Python libraries, etc.
- Inna Zharnitsky: Write an extractor that does grep along with the word count for demonstration purposes and includes JSON-LD
- Sandeep Puthanveetil Satheesan: Write an extractor that accepts a CSV file with, say, 3 columns (probably with values from a weather or bacterial growth model (see Problem 2.2 below)) and calculates the average of a specific column
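The core of the grep exercise, with its yes/no flag and a JSON-LD context, might look like the sketch below. It uses Python's `re` module in place of the external 'grep' command and leaves out the pyclowder and RabbitMQ plumbing entirely. The `example.org` IRIs and the field names are placeholders, not a vocabulary that Clowder defines.

```python
import re

# Hypothetical JSON-LD context: maps each short key to an IRI that defines it.
CONTEXT = {
    "pattern": "http://example.org/terms#pattern",
    "pattern_found": "http://example.org/terms#patternFound",
    "words": "http://example.org/terms#wordCount",
}


def grep_metadata(text, pattern):
    """Build metadata with a word count and a yes/no flag for the pattern."""
    found = re.search(pattern, text) is not None
    return {
        "@context": CONTEXT,
        "pattern": pattern,
        "pattern_found": "yes" if found else "no",
        "words": len(text.split()),
    }
```

The `@context` entry is the whole intuition behind JSON-LD here: the document is ordinary JSON, and the context tells a consumer what each key means.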
- Part 2: Teach to write a converter (35 mins)
- Start with the bd-template for a converter: imagemagick
- Ask participants to modify the converter's input/output formats in the comment section, then check the result using the polyglot web UI for post and get
- Another example - FFmpeg converter for audio and video
- TODO
- Smruti Padhy, Marcus Slavenas, Jong Lee: Provide step-by-step instructions/screenshots of modifying imagemagick and usage of the polyglot GUI. Provide a default username/password
- Marcus Slavenas, Kenton McHenry: Write a converter using FFmpeg
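For the FFmpeg example, the command such a converter wraps could be sketched as below. This does not follow the bd-template layout or its comment-section conventions. The helper names are hypothetical, and the only FFmpeg options used are `-i` (input file) and `-y` (overwrite the output without asking).

```python
import os
import subprocess


def ffmpeg_command(input_path, output_format):
    """Return (argument list, output path) for a basic FFmpeg conversion."""
    base, _ = os.path.splitext(input_path)
    output_path = f"{base}.{output_format}"
    return ["ffmpeg", "-y", "-i", input_path, output_path], output_path


def convert(input_path, output_format):
    """Run FFmpeg; raises CalledProcessError if the conversion fails."""
    cmd, output_path = ffmpeg_command(input_path, output_format)
    subprocess.run(cmd, check=True)
    return output_path
```

FFmpeg infers the output container from the file extension, so the sketch needs no explicit codec flags for common audio and video targets.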
- Part 3: Teach to upload a converter or an extractor to the locally installed Tools Catalog (20 mins)
- Inna Zharnitsky: Step-by-step procedure to upload an extractor or a converter, an input file and an output file without a docker file.
- Part 4 (Optional - For advanced user): Dockerize the tool
...