  • Participants will use their own laptops for this part
    • We will provide a VM with everything pre-installed through Nebula.
      • TODO: Rob will ask Doug whether we can spawn 50 VMs on Nebula for the tutorial session
      • TODO: Order 50+ flash drives containing the VMs as a backup
      • TODO: Create a VM with everything installed and take a snapshot, which will then be deployed within Nebula. Approx. time required: 2 days
      • TODO: Not sure about Jetstream yet.
    • Provide clear instructions on how to access the VMs in Nebula with proper credentials.
      • TODO: Write clear instructions on how to access the VMs (e.g., through ssh) from different OSes.
    • (Before tutorial: wiki pages with clear instructions) Install Python/R/MATLAB/cURL and the required libraries to use the BD services, for anyone interested in using them in the future.
      • TODO: Create wiki pages with clear instructions
  • Demonstration of BD Fiddle
    • Sign up for the Brown Dog service
    • Obtain a key/token using curl or Postman
    • Use the token and the BD Fiddle interface to see BD in action.
    • Copy and paste the Python code snippet and use it in the application explained next.
    • TODO: Create a document for the demo with step-by-step screenshots
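
The key/token step above could also be scripted in Python rather than done with curl or Postman. The following is a minimal sketch only: the base URL is a placeholder, and the `/keys` and `/keys/<key>/tokens` paths, the use of HTTP Basic authentication, and the function names are assumptions for illustration, not confirmed details of the BD API. The functions only build the requests; sending them is left to the caller.

```python
import base64
import urllib.request

# Placeholder base URL (assumption); substitute the real BD API address.
BD_API = "https://bd-api.example.org/v1"


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_key_request(username, password, base_url=BD_API):
    """Build (but do not send) a POST request asking for an API key.

    The '/keys' path is a hypothetical endpoint used for illustration.
    """
    return urllib.request.Request(
        url=f"{base_url}/keys",
        method="POST",
        headers={
            "Authorization": basic_auth_header(username, password),
            "Accept": "application/json",
        },
    )


def build_token_request(username, password, key, base_url=BD_API):
    """Build (but do not send) a POST request exchanging a key for a token.

    The '/keys/<key>/tokens' path is likewise a hypothetical endpoint.
    """
    return urllib.request.Request(
        url=f"{base_url}/keys/{key}/tokens",
        method="POST",
        headers={
            "Authorization": basic_auth_header(username, password),
            "Accept": "application/json",
        },
    )
```

A participant would pass the returned object to `urllib.request.urlopen(...)` and read the JSON response; the response field names depend on the actual service and are not assumed here.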

  • Create applications using BD services
    Three applications:
    • Problem 1: Given a collection of images with embedded text (or scanned images of handwritten documents), search the images based on their content.
      • Images can be uploaded from a local directory, or obtained through the Flickr API (maybe just a sample snippet) or another external web service.
          • TODO: Create an example dataset with images
          • TODO: Provide a code snippet that uses an external service to obtain images, e.g., the Flickr API.
            • This will be provided only as an example and will not be used in the rest of the code.
      • Write a Python script to upload all images to BD (provide this code)
      • Let the participants use the BD Python library to obtain a key/token and write requests to the BD API
      • Make sure the OCR extractor (or Census extractor) is running
      • Once the technical metadata is obtained from BD, index the tags and technical metadata in a locally running Elasticsearch instance (provide this code as well)
      • Search for images using an ES query
      • Participants will need to install ES for this application.
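
The indexing and search steps of Problem 1 could be sketched as follows. This is an illustration under stated assumptions: the index name `images` and the field names `ocr_text` and `tags` are hypothetical, and the shape of the metadata returned by BD is not assumed beyond being a JSON-serializable dictionary. The functions only build the request bodies for the Elasticsearch `_bulk` and `_search` REST endpoints; older Elasticsearch versions may additionally require a `_type` in the bulk action line.

```python
import json


def bulk_index_body(index, docs):
    """Build a newline-delimited JSON body for the Elasticsearch _bulk endpoint.

    `docs` maps a document id (here, an image filename) to its metadata dict.
    Each document becomes two lines: an action line and the document itself.
    """
    lines = []
    for doc_id, doc in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def text_search_query(text, field="ocr_text"):
    """Build a full-text `match` query body for the _search endpoint.

    The default field name is a hypothetical choice for this sketch.
    """
    return {"query": {"match": {field: text}}}


if __name__ == "__main__":
    # Hypothetical metadata, standing in for what BD would return.
    metadata = {
        "scan_001.png": {"ocr_text": "census record 1930", "tags": ["census"]},
        "scan_002.png": {"ocr_text": "handwritten letter", "tags": ["letter"]},
    }
    print(bulk_index_body("images", metadata))
    print(json.dumps(text_search_query("census")))
```

The bulk body would be sent with `Content-Type: application/x-ndjson` to `http://localhost:9200/_bulk`, and the query body to `http://localhost:9200/images/_search`, assuming a default local installation.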

    • Problem 2: Given a collection of text files from a survey or from book/movie reviews, use the sentiment analysis extractor to calculate a sentiment value for each file and group files with similar values together.
      • Write a Python script to upload all files from a directory to the BD API
      • Save the results for each text file in a single file
      • Create separate folders and move each file based on its value
      • Try the Yelp API or IMDB API to obtain reviews (???), or use the Twitter API (??) to pull some reviews
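
The grouping step of Problem 2 (create folders and move files by sentiment value) could be sketched as below. The sketch assumes the extractor yields one numeric score per file in the range -1 to 1; that range, the three bucket names, the neutral band of 0.1, and the function names are all assumptions for illustration and may not match the actual extractor output.

```python
import shutil
from pathlib import Path


def sentiment_bucket(value, neutral_band=0.1):
    """Map a sentiment score (assumed to lie in [-1, 1]) to a folder name."""
    if value > neutral_band:
        return "positive"
    if value < -neutral_band:
        return "negative"
    return "neutral"


def group_files(scores, src_dir, dest_dir):
    """Move each scored file from src_dir into dest_dir/<bucket>/.

    `scores` maps a filename to its sentiment score.
    Returns a mapping of filename to the bucket it was moved into.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    moved = {}
    for name, value in scores.items():
        target = dest / sentiment_bucket(value)
        target.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src / name), str(target / name))
        moved[name] = target.name
    return moved
```

For example, `group_files({"review1.txt": 0.8, "review2.txt": -0.5}, "reviews", "grouped")` would move the first file into `grouped/positive` and the second into `grouped/negative`.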
    • Problem 3: Use BD convert to convert a collection of images in old formats to png or pdf, and to convert odp/odt to ppt/doc
                   
    • CSV files (AmeriFlux): use BD for gap-filling on the files and return the result.

    • Think of an R client for BD. (PEcAn??)

    • Think of a MATLAB client 

    • Want to build a web application on top of BD (similar to what Marcus built??)
