...

Participants will use their own laptops for this part
- We will provide a VM with everything pre-installed through Nebula.
  - TODO: Rob will ask Doug whether we can spawn 50 VMs on Nebula for the tutorial session
  - TODO: Order 50+ flash drives as a backup; they will contain the VMs
  - TODO: Create a VM with everything installed and take a snapshot, which will then be deployed within Nebula. Approx. time required: 2 days
  - TODO: Not sure about Jetstream yet.
- Provide clear instructions on how to access the VMs in Nebula with proper credentials.
  - TODO: Clear instructions on how to access the VMs (e.g., through ssh) for different OSes.
- (Before tutorial - wiki pages with clear instructions) Install Python/R/MATLAB/cURL to use the BD services, along with the required libraries, in case anyone is interested in using the BD services in the future.
  - TODO: Create wiki pages with clear instructions
Demonstration of use of BD Fiddle
- Sign up for Brown Dog Service
- Obtain a key/token using curl or Postman
- Use the token and the BD Fiddle interface to see BD in action.
- Copy and paste the Python code snippet and use it in the application to be explained next.
- TODO: Create a document for the demo with step-by-step screenshots
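
The key/token step of the demo could also be done from Python instead of curl or Postman. Below is a minimal sketch using only the standard library. The base URL, the `/keys` and `/keys/<key>/tokens` paths, and the function names are assumptions for illustration, not the confirmed BD API, and should be checked against the BD documentation. The functions only build the requests; actually sending them is left to the caller.

```python
import base64
import urllib.request

# Assumed placeholder base URL; replace with the BD API endpoint used in the tutorial.
BD_API = "https://bd-api.example.org/v1"


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_key_request(user, password):
    """Build (but do not send) a POST request asking for a new key.

    The '/keys' path is an assumption about the service layout.
    """
    req = urllib.request.Request(f"{BD_API}/keys", data=b"", method="POST")
    req.add_header("Authorization", basic_auth_header(user, password))
    req.add_header("Accept", "application/json")
    return req


def build_token_request(user, password, key):
    """Build (but do not send) a POST request asking for a token for `key`.

    The '/keys/<key>/tokens' path is an assumption about the service layout.
    """
    req = urllib.request.Request(
        f"{BD_API}/keys/{key}/tokens", data=b"", method="POST"
    )
    req.add_header("Authorization", basic_auth_header(user, password))
    req.add_header("Accept", "application/json")
    return req


if __name__ == "__main__":
    # Placeholder credentials; nothing is sent over the network here.
    request = build_key_request("alice", "secret")
    print(request.get_method(), request.full_url)
    # To send it for real: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the snippet easy to walk through in the demo before anyone has working credentials.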
Create applications using BD services
Three applications:
- Problem 1: Given a collection of images with text embedded in them ~~(or scanned handwritten documents image)~~, search the images based on their content.
  - Images can be uploaded from a local directory, obtained through the Flickr API (maybe just a sample snippet), or obtained from another external web service.
    - TODO: Create an example dataset with images
    - TODO: Provide a code snippet that uses an external service to obtain images, e.g. the Flickr API.
      This will only be provided as an example and will not be used in the rest of the code.
  - Write a Python script to upload all images to BD (provide this code)
  - Let the participants use the BD Python library to obtain a key/token and write requests to the BD API
  - Make sure the OCR (or Census) extractor is running
  - Once the technical metadata is obtained from BD, index the tags and technical metadata in a locally running Elasticsearch instance (provide this code as well)
  - Search for images using an ES query
  - Participants will need to install ES for this application.
- Problem 2: Given a collection of text files from a survey or reviews of a book/movie, use the sentiment analysis extractor to calculate the sentiment value for each file and group similar values together.
  - Write a Python script to upload all files from a directory to the BD API
  - Save the results for each text file in a single file
  - Create separate folders and move each file based on its value
  - ~~Tried to see Yelp API, IMDB API to obtain reviews (???) or use Twitter API (??) to pull some reviews~~
- Problem 3: Use BD convert to convert a collection of images in an old format to PNG or PDF, and to convert ODP/ODT files to PPT/DOC.
- ~~CSV files (Ameriflux), use BD for some gap-filling on the files and return result.~~
- Think of an R client for BD. (PEcAn??)
- ~~Think of a MATLAB client~~
- ~~Want to build a Web application on top of BD, (Similar to what Marcus built??)~~
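
For the indexing and search steps of Problem 1, the code handed to participants might look roughly like the sketch below. It only builds the payloads for Elasticsearch's `_bulk` and `_search` REST endpoints with the standard library. The index name `bd_images`, the field names `filename`, `tags` and `ocr_text`, and the sample documents are hypothetical placeholders; the real metadata returned by BD will have a different shape.

```python
import json

INDEX = "bd_images"  # hypothetical index name


def bulk_body(docs, index=INDEX):
    """Build the newline-delimited JSON body expected by the ES _bulk endpoint.

    Each document becomes two lines: an action line and the document itself.
    """
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_id": doc["filename"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API requires every line, including the last, to end with "\n".
    return "".join(line + "\n" for line in lines)


def text_query(words, size=10):
    """Build a query that matches `words` against the OCR text and the tags."""
    return {
        "size": size,
        "query": {
            "multi_match": {"query": words, "fields": ["ocr_text", "tags"]}
        },
    }


if __name__ == "__main__":
    # Made-up sample metadata, standing in for what the extractor returns.
    sample = [
        {"filename": "img001.jpg", "tags": ["street"], "ocr_text": "STOP"},
        {"filename": "img002.jpg", "tags": ["shop"], "ocr_text": "OPEN 24 HOURS"},
    ]
    # POST this to http://localhost:9200/_bulk
    # with the header Content-Type: application/x-ndjson
    print(bulk_body(sample))
    # POST this as JSON to http://localhost:9200/bd_images/_search
    print(json.dumps(text_query("open hours"), indent=2))
```

Using the document's file name as `_id` means re-running the indexing script overwrites existing entries instead of creating duplicates.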

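The save-and-group steps of Problem 2 can be sketched locally, independent of the BD API. This assumes the sentiment extractor returns one numeric score per file in the range -1 to 1; that range, the `neutral_band` threshold, the three folder names, and the CSV layout are illustrative assumptions, not something BD defines.

```python
import csv
import shutil
import tempfile
from pathlib import Path


def bucket(score, neutral_band=0.1):
    """Map a sentiment score to a group name (thresholds are assumptions)."""
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


def save_results(scores, out_csv):
    """Write all per-file scores and their groups to a single CSV file."""
    with open(out_csv, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["file", "sentiment", "group"])
        for name, score in sorted(scores.items()):
            writer.writerow([name, score, bucket(score)])


def group_files(src_dir, scores):
    """Move each scored file into a sub-folder named after its group."""
    src = Path(src_dir)
    moved = {}
    for name, score in scores.items():
        target_dir = src / bucket(score)
        target_dir.mkdir(exist_ok=True)
        shutil.move(str(src / name), str(target_dir / name))
        moved[name] = target_dir.name
    return moved


if __name__ == "__main__":
    # Demo on throwaway files; the scores are made up for illustration.
    with tempfile.TemporaryDirectory() as tmp:
        scores = {"review1.txt": 0.8, "review2.txt": -0.6, "review3.txt": 0.0}
        for name in scores:
            (Path(tmp) / name).write_text("placeholder review text")
        save_results(scores, Path(tmp) / "results.csv")
        print(group_files(tmp, scores))
```

Writing the CSV before moving the files keeps a record of every score even if a move fails partway through.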
...
