October Review 2016

XSEDE Workshop - July

Review - Mid October

User Workshop - After October

Priorities:

Polyglot refactoring
Update extractors to latest technologies
1. JSONLD
2. Docker containers
3. Extractor metadata registration
4. pyclowder
5. Add status messages to all extractors and fix level granularity
  1. Make status constants (DONE, ERROR)
  2. ArcGIS multiprocessing extractor
6. Register on on-demand queues
7. Standardize around Python logging
Polyglot information loss
Provenance
1. DataWolf
  1. Polyglot: add file.jpg.log and file.jpg.wf to id
  2. Clowder: each step is one of the extractors executed on the specific file
  3. Check file format at every step
Add new tools
1. Look at the ones in Jira labeled as "Extractors" and "Converters"
2. Praveen's new extractor
3. Support students in doing this
Move data vs move computation
1. Low-hanging fruit implementation?
2. Host large files locally?
Logstash and Kibana
1. Add Logstash to the Dockerfile
2. Make extractor and software server logs consistent
  1. Standardize around Python logging
  2. Don't forget Java extractors (versus, audio)
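
Sketch for the "status constants" and "standardize around Python logging" items above. This is only an illustration, not the pyclowder API: the names ExtractorStatus, get_extractor_logger and format_status are hypothetical, and only the DONE and ERROR values come from these notes. It uses the standard library logging and enum modules.

```python
import logging
from enum import Enum


class ExtractorStatus(Enum):
    """Hypothetical shared status constants (only DONE and ERROR are from the notes)."""
    DONE = "DONE"
    ERROR = "ERROR"


# Assumed common format so that all extractor logs look the same downstream.
LOG_FORMAT = "%(asctime)s %(levelname)s %(name)s: %(message)s"


def get_extractor_logger(name, level=logging.INFO):
    """Return a logger with the shared format; repeated calls add no extra handlers."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(LOG_FORMAT))
        logger.addHandler(handler)
    return logger


def format_status(status, file_id, message=""):
    """Build one status line from a status constant, a file id and an optional message."""
    return f"{status.value} file={file_id} {message}".strip()


if __name__ == "__main__":
    log = get_extractor_logger("extractor.example")
    log.info(format_status(ExtractorStatus.DONE, "file.jpg"))
    log.error(format_status(ExtractorStatus.ERROR, "file.jpg", "unsupported format"))
```

Keeping the constants and the logger setup in one shared module would let every Python extractor emit the same status strings at the same log levels. Java extractors would need an equivalent convention of their own.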
