...

  • We will provide a VM with everything pre-installed in it through Nebula. 
    •  Rob Kooper will ask Doug whether we can spawn 50 VMs on Nebula for the tutorial session. (DONE) We will get 50 VMs on Nebula.
    •  Smruti Padhy Order 50+ flash drives containing the VMs as a backup.
    •  Smruti Padhy, Marcus Slavenas (who else?) Create a VM with everything installed and take a snapshot, which will then be deployed within Nebula. Approx. time required: 2 days.
    •  Convert the VM created (from the OpenStack image) to VirtualBox format (*.vdi) and test the configuration.
    •  Smruti Padhy (who else??) Make a list of all software required and the directory structure for the tutorial.
    •  Write a script for deployment of 50 VMs from the VM snapshot that we created. 
    •  Local installation of fence with local authentication. This is for backup to be provided in the preinstalled VM.
    •  Test the BD service with 50 concurrent users performing conversion/extraction tasks.
    •  Luigi Marini The maximum size of a file that can be uploaded to Brown Dog needs to be limited, to ensure that no one uploads large files.
    •  Not sure of Jetstream yet. 
    • Provide clear instructions on how to access the VMs in Nebula with proper credentials.
      •  Clear instructions on how to access the VMs (e.g., through SSH) from different OSes.
      •  Need training accounts on Nebula. Provide an SSH key pair to each participant.
    • (Before the tutorial: wiki pages with clear instructions) Participants install Python/R/MATLAB/cURL, along with the required libraries, in case anyone is interested in using the BD services in the future.
      •  Create wiki pages with clear instructions.
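
The 50-concurrent-user test listed above could start from a small harness like the following. This is only a sketch: `run_load_test` and `fake_conversion` are hypothetical names, and `fake_conversion` is a stand-in that sleeps instead of contacting BD. A real test would replace it with an actual conversion or extraction request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(task, num_users=50):
    """Run task(user_id) once per simulated user, all at the same time.

    Returns a list of (user_id, ok, seconds) tuples, one per user.
    """
    def timed(user_id):
        start = time.perf_counter()
        try:
            task(user_id)
            ok = True
        except Exception:
            # Any exception counts as a failed request for that user.
            ok = False
        return user_id, ok, time.perf_counter() - start

    # One worker thread per simulated user so the calls overlap.
    with ThreadPoolExecutor(max_workers=num_users) as pool:
        return list(pool.map(timed, range(num_users)))


def fake_conversion(user_id):
    """Hypothetical stand-in for one participant's request to BD."""
    time.sleep(0.01)


if __name__ == "__main__":
    results = run_load_test(fake_conversion, num_users=50)
    failures = [r for r in results if not r[1]]
    slowest = max(seconds for _, _, seconds in results)
    print(f"{len(results)} requests, {len(failures)} failed, slowest {slowest:.2f}s")
```

Threads are used because each simulated user spends most of its time waiting on the network, so 50 of them can overlap without extra machinery.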

...

  • Demonstration of use of BD Fiddle (20 mins)
    • Sign up for Brown Dog Service
    • Obtain a key/token using curl, Postman, or an IPython notebook
    • Use the token and the BD Fiddle interface to see BD in action.
    • Copy and paste the Python code snippet and use it in the application to be explained next.
    •  Training account for the BD service (note this is different from the Nebula account)
    •  Create a document for the demo with step-by-step screenshots for all above steps.
    •  Eugene Roeder Fix the CORS error for the file URL option (I think it is a known issue). Please add the JIRA issue number here.
    •  Fix the delay experienced when a file is uploaded from a local directory to the BD Fiddle UI
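
For the step-by-step document, the "obtain a key/token" step might be illustrated in Python roughly as follows. This is a sketch under assumptions: the `/keys` path, the use of HTTP Basic authentication, the host `bd.example.org`, and the function name `build_key_request` are placeholders for illustration and must be checked against the current BD REST API. The request is only built here, not sent.

```python
import base64
import urllib.request


def build_key_request(base_url, username, password):
    """Build (but do not send) a POST request that asks for an API key.

    Assumption: keys are requested with HTTP Basic auth at <base_url>/keys.
    Check the BD REST API documentation for the real endpoint.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    auth_value = "Basic " + base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        base_url.rstrip("/") + "/keys",
        method="POST",
        headers={"Authorization": auth_value, "Accept": "application/json"},
    )


if __name__ == "__main__":
    # Placeholder host and credentials, not a real BD deployment.
    req = build_key_request("https://bd.example.org/v1", "trainee01", "not-a-real-password")
    print(req.get_method(), req.full_url)
    # Sending it would be urllib.request.urlopen(req), then parsing the JSON reply.
```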

...

  • Create applications using BD services (50 mins)
    • Conversion Example (15 mins): Convert a collection of image/ps/odp/audio/video files to png/pdf/ppt/mp3. This demonstrates that if you have a directory of files in old formats, you can use BD to convert them all without installing any software. (Emphasizes conversion)
      • Make sure the ImageMagick and FFmpeg converters are running before the demo
      • Obtain the 
    • TODO:
      •  Provide a Python script for this and let participants use the Python library to call the BD service
      •  Provide step-by-step instructions with screenshots for this

    • Extraction/Indexing/Retrieval Example (20 mins): (Emphasizes extraction from unstructured data, indexing, and content-based retrieval)

      • Make sure the OCR, face, and speech2text extractors are running before starting the demo
      • Participants can upload images from a local directory or obtain images from an external web service.
      • Let the participants use the BD Python library to obtain a key/token and submit an extraction request to the BD-API gateway
      • Once the technical metadata is obtained from BD, write the tags and technical metadata to a local file or a Python dictionary.
      • Search for files by tags/technical metadata with a linear search over the index file

      • TODO
        •  Create an example dataset of images and audio files against which we can run interesting queries
        •  (Optional) Provide a code snippet that uses an external service to obtain images, e.g. the Flickr API.
            •  This will only be provided as an example and will not be used for the rest of the code.
        •  Provide the link to the current BD REST API and create a document/wiki page showing, with step-by-step screenshots, how to obtain a key/token using the Python library.
        •  Write a Python script that will serve as a stub for the BD client
          • The participant will fill in the code that uses the Python library to call the BD REST API and submit their requests.
        •  Write a Python script that will build the index
          • The Python script should write the tags and technical metadata to a local file. (Probably can use the BD-CLI library, or the Python library's index method, which writes it as feature vectors.)
        •  Write a Python script to run interesting searches/queries against the index file. Again, probably use the Python library's find method or just read the local file.
    • Problem 4: Given a collection of *.xlsx files, obtain results based on the values of some columns. (Emphasizes extraction and analysis of scientific data)
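
For the Extraction/Indexing/Retrieval example, the "write tags and technical metadata to a local file, then search it linearly" steps could be sketched as below. The one-JSON-object-per-line layout, the names `write_index` and `search_index`, and the sample records are assumptions for illustration. They are not the BD Python library's index or find methods, and the sample records are made up rather than real extractor output.

```python
import json
import tempfile
from pathlib import Path


def write_index(index_path, metadata_by_file):
    """Write one JSON line per file with its tags and technical metadata."""
    with open(index_path, "w", encoding="utf-8") as handle:
        for filename, record in metadata_by_file.items():
            entry = {
                "file": filename,
                "tags": record.get("tags", []),
                "metadata": record.get("metadata", {}),
            }
            handle.write(json.dumps(entry) + "\n")


def search_index(index_path, term):
    """Linear search: files whose tags or metadata values contain term."""
    term = term.lower()
    matches = []
    with open(index_path, encoding="utf-8") as handle:
        for line in handle:
            entry = json.loads(line)
            # Compare against every tag and every metadata value as text.
            haystack = [str(t) for t in entry["tags"]]
            haystack += [str(v) for v in entry["metadata"].values()]
            if any(term in item.lower() for item in haystack):
                matches.append(entry["file"])
    return matches


if __name__ == "__main__":
    # Made-up records standing in for what the extractors might return.
    sample = {
        "photo1.jpg": {"tags": ["face"], "metadata": {"ocr_text": "Main Street"}},
        "talk.wav": {"tags": ["speech"], "metadata": {"transcript": "welcome everyone"}},
    }
    with tempfile.TemporaryDirectory() as tmp:
        index_file = Path(tmp) / "index.jsonl"
        write_index(index_file, sample)
        print(search_index(index_file, "street"))
```

A line-per-record file keeps the stub simple: participants can open it in a text editor, and the search reads it top to bottom without loading everything at once.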

...

  • Tutorial feedback form
  • Announcement of next user workshop

 

OPTIONAL EXAMPLES

  • Problem 2: Given a collection of text files from a survey or reviews of a book/movie, use the sentiment analysis extractor to calculate the sentiment value for each file and group similar values together. (Emphasizes extraction from unstructured data and useful analysis)
    • A collection of text files with reviews
      •  Obtain an example dataset from the web.
    • Let the participants use the BD Python library to obtain a key/token and submit requests to the BD-API gateway
      •  Provide the link to the current BD REST API and create a document/wiki page showing, with step-by-step screenshots, how to obtain a key/token using the Python library.
      •  Write a Python script that will serve as a stub for the BD client
          • The participant will fill in the BD REST API calls that submit their requests.
    • Make sure the Sentiment Analysis extractor is running
    • Save the results for all text files in a single file, with each file's corresponding value
      •  Provide code for this in the stub script
    • Create separate folders and move each file based on its sentiment value
      •  Provide code that does this in the stub
    • (Optional) Index the text files along with their sentiment values and use an ES visualization tool to search for documents with a sentiment value below some threshold.
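
The "save results in a single file" and "move files into folders by sentiment value" steps of Problem 2 could be provided in the stub along these lines. This is a sketch under assumptions: it supposes the extractor returns one numeric score per file, roughly in the range -1 to 1, and the thresholds, group names, and function names (`bucket_for`, `save_and_group`) are hypothetical choices to be adjusted once the real output of the Sentiment Analysis extractor is known.

```python
import csv
import shutil
import tempfile
from pathlib import Path


def bucket_for(score, negative_below=-0.25, positive_above=0.25):
    """Map a sentiment score to a folder name (thresholds are assumptions)."""
    if score < negative_below:
        return "negative"
    if score > positive_above:
        return "positive"
    return "neutral"


def save_and_group(scores, source_dir, results_csv):
    """Write all scores to one CSV file, then move each file into its group folder."""
    source_dir = Path(source_dir)
    with open(results_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "sentiment", "group"])
        for name, score in sorted(scores.items()):
            group = bucket_for(score)
            writer.writerow([name, score, group])
            target_dir = source_dir / group
            target_dir.mkdir(exist_ok=True)
            shutil.move(str(source_dir / name), str(target_dir / name))


if __name__ == "__main__":
    # Made-up scores standing in for extractor results.
    with tempfile.TemporaryDirectory() as tmp:
        reviews = Path(tmp)
        for name in ("r1.txt", "r2.txt"):
            (reviews / name).write_text("sample review", encoding="utf-8")
        save_and_group({"r1.txt": -0.7, "r2.txt": 0.6}, reviews, reviews / "results.csv")
        print(sorted(p.name for p in reviews.iterdir()))
```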