...
- Demonstration of use of BD Fiddle
- Sign up for Brown Dog Service
- Obtain a key/token using curl, Postman, or an IPython notebook
- Use the token with the BD Fiddle interface to see BD in action.
- Copy and paste the Python code snippet and use it in the application explained next.
- Create a document for the demo with step-by-step screenshots for all the above steps.
- Eugene Roeder: Fix the CORS error for the file URL option (I think it is a known issue). Please add the JIRA issue number here.
- Delay: Fix the delay experienced when a file is uploaded from a local directory to the BD Fiddle UI
- Create applications using BD services
Three applications:
- Problem 1: Given a collection of images with embedded text, search for images based on their content. (Emphasizes extraction from unstructured data, indexing, and content-based retrieval)
- Images can be uploaded from a local directory or obtained from an external web service.
- Create an example dataset of images that supports an interesting query
- Provide a code snippet that uses an external service to obtain images, e.g. the Flickr API.
- This will only be provided as an example and will not be used for the rest of the code.
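A minimal sketch of the external-service snippet mentioned above, assuming Flickr's REST method `flickr.photos.search` and its documented static image URL pattern. The API key is a placeholder, the helper names are illustrative, and the code only builds URLs; sending the request is left to the participant.

```python
from urllib.parse import urlencode

FLICKR_REST = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, text, per_page=10):
    """Build a flickr.photos.search URL that returns plain JSON."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,        # placeholder: use your own Flickr API key
        "text": text,
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,       # ask for raw JSON instead of JSONP
    }
    return FLICKR_REST + "?" + urlencode(params)


def photo_urls(search_response):
    """Turn a parsed flickr.photos.search response into image URLs."""
    photos = search_response.get("photos", {}).get("photo", [])
    return [
        "https://live.staticflickr.com/{server}/{id}_{secret}.jpg".format(**p)
        for p in photos
    ]
```

The returned URLs can be downloaded to a local directory and then treated like any other local image collection.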
- Let the participant use the BD Python library to obtain a key/token and submit requests to the BD-API gateway
- Provide the link to the current BD REST API and create a document/wiki page showing step-by-step screenshots of obtaining a key/token using the Python library.
- Write a Python script that will serve as a stub for the BD client
- The participant will fill in the BD REST API calls to submit their requests.
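One possible shape for the stub script, as a sketch only. The gateway URL, the `/keys` and `/keys/<key>/tokens` paths, and the use of HTTP Basic authentication are assumptions made for illustration; the real values should be taken from the BD REST API documentation. The class only builds requests and leaves the submission call for the participant to fill in.

```python
import base64
import urllib.request

# Assumption: placeholder gateway URL; replace with the real BD-API gateway.
BD_API_URL = "https://bd-api.example.org"


class BDClientStub:
    """Skeleton BD client; endpoint paths below are hypothetical."""

    def __init__(self, base_url=BD_API_URL):
        self.base_url = base_url.rstrip("/")

    def _basic_auth(self, username, password):
        raw = f"{username}:{password}".encode()
        return "Basic " + base64.b64encode(raw).decode()

    def key_request(self, username, password):
        """Build (but do not send) a request for a new key."""
        return urllib.request.Request(
            f"{self.base_url}/keys",  # hypothetical path
            method="POST",
            headers={
                "Authorization": self._basic_auth(username, password),
                "Accept": "application/json",
            },
        )

    def token_request(self, username, password, key):
        """Build (but do not send) a request for a token from a key."""
        return urllib.request.Request(
            f"{self.base_url}/keys/{key}/tokens",  # hypothetical path
            method="POST",
            headers={
                "Authorization": self._basic_auth(username, password),
                "Accept": "application/json",
            },
        )

    def submit_file(self, token, file_path):
        """TODO (participant): call the BD REST API to submit file_path."""
        raise NotImplementedError("Fill in the BD REST API call here")
```

A request built this way can be sent with `urllib.request.urlopen(request)` once the real endpoint values are in place.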
- Make sure OCR and face extractor are running before starting the demo
- Python
- Make sure Elasticsearch is started before the example files are submitted to the BD service
- Provide instructions to start Elasticsearch and a web client for visualization.
- Make sure the cluster name in the config.yml differs for each participant.
- Once technical metadata is obtained from BD, index the tags and technical metadata in a locally running Elasticsearch instance.
- Write a Python script that will index the technical metadata in ES
- Search for the image using ES query
- Provide an ES query for the search
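A sketch of the indexing and search steps above, using only the standard library to prepare an Elasticsearch bulk request body and a `match` query. The index name `bd_images` and the field names `filename`, `tags`, and `ocr_text` are assumptions chosen for illustration; the actual fields depend on what the BD extractors return.

```python
import json

INDEX_NAME = "bd_images"  # hypothetical index name


def to_bulk_body(records, index=INDEX_NAME):
    """Serialize records as newline-delimited JSON for the ES bulk API."""
    lines = []
    for rec in records:
        # Action line: index this document, using the file name as its id.
        lines.append(json.dumps({"index": {"_index": index, "_id": rec["filename"]}}))
        # Source line: the metadata to be indexed (field names are assumed).
        lines.append(json.dumps({
            "filename": rec["filename"],
            "tags": rec.get("tags", []),
            "ocr_text": rec.get("ocr_text", ""),
        }))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


def text_query(words):
    """Full-text query against the extracted text field."""
    return {"query": {"match": {"ocr_text": words}}}
```

The bulk body would be POSTed to the `_bulk` endpoint of the local Elasticsearch instance with the `application/x-ndjson` content type, and the query to the index's `_search` endpoint.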
- Problem 2: Given a collection of text files from a survey or reviews of a book/movie, use the sentiment analysis extractor to calculate the sentiment value for each file and group similar values together. (Emphasizes extraction from unstructured data and useful analysis)
- A collection of text files with reviews
- Obtain an example dataset from the web.
- Let the participant use the BD Python library to obtain a key/token and submit requests to the BD-API gateway
- Provide the link to the current BD REST API and create a document/wiki page showing step-by-step screenshots of obtaining a key/token using the Python library.
- Write a Python script that will serve as a stub for the BD client
- The participant will fill in the BD REST API calls to submit their requests.
- Make sure the Sentiment Analysis extractor is running
- Save the results for each text file in a single file with the corresponding values
- Provide code for this in the stub script
- Create separate folders and move each file based on its sentiment value
- Provide code in the stub that performs the above action
- (Optional) Index the text files along with their sentiment values and use an ES visualization tool to search for documents with a sentiment value below a given threshold.
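The save and group steps above could look roughly like this in the stub. This is a sketch under assumptions: the sentiment value is taken to be a float where negative means negative sentiment, and the three buckets and the `neutral_band` threshold are illustrative choices, not part of the BD extractor.

```python
import csv
import shutil
from pathlib import Path


def bucket(score, neutral_band=0.1):
    """Map a sentiment score to a folder name (thresholds are assumed)."""
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


def save_results(scores, csv_path):
    """Write one row per text file with its sentiment value."""
    with open(csv_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["file", "sentiment"])
        for name, score in sorted(scores.items()):
            writer.writerow([name, score])


def group_files(scores, source_dir, target_dir):
    """Move each file into a sub-folder named after its sentiment bucket."""
    for name, score in scores.items():
        dest = Path(target_dir) / bucket(score)
        dest.mkdir(parents=True, exist_ok=True)
        shutil.move(str(Path(source_dir) / name), str(dest / name))
```

Here `scores` is a dictionary mapping file names to the values returned by the sentiment analysis extractor.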
- Problem 3: Use BD conversion to convert a collection of image/ps/odp files to png/pdf/ppt. This demonstrates that if you have a directory of files in old formats, you can use BD to convert them all. (Emphasizes conversion)
- Provide a Python script for this and let the participant use the Python library to call the BD service
- Problem 4: Given a collection of *.xlsx files, obtain results based on the values of selected columns. (Emphasizes extraction and analysis of scientific data)
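A possible skeleton for the conversion script, as a sketch only. The extension-to-format table is an illustrative assumption, and `convert` is a hypothetical callable standing in for whatever conversion function the BD Python library provides; it is expected to take a source path and an output format and return the converted bytes.

```python
from pathlib import Path

# Assumption: which output format to request for each old input format.
TARGETS = {".ps": "pdf", ".odp": "ppt", ".bmp": "png", ".tif": "png"}


def plan_conversions(directory, targets=TARGETS):
    """List (source, destination) pairs for files that need converting."""
    plan = []
    for path in sorted(Path(directory).iterdir()):
        target = targets.get(path.suffix.lower())
        if path.is_file() and target:
            plan.append((path, path.with_suffix("." + target)))
    return plan


def convert_all(directory, convert, targets=TARGETS):
    """Convert every matching file and write the result next to the source.

    convert(src_path, output_format) -> bytes is a hypothetical wrapper
    around the BD conversion call, supplied by the participant.
    """
    outputs = []
    for src, dst in plan_conversions(directory, targets):
        dst.write_bytes(convert(src, dst.suffix.lstrip(".")))
        outputs.append(dst)
    return outputs
```

Keeping the BD call behind the `convert` argument lets the directory handling be tried out with a dummy function before the service is available.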
...