XSEDE Workshop - July
Review - Mid October
User Workshop - After October
Priorities:
- Polyglot refactoring
- Update extractors to the latest technologies
- JSON-LD
- Docker containers
- Extractor metadata registration
- pyclowder
- Add status messages to all extractors and fix level granularity
- Make status constants (DONE, ERROR)
- ArcGIS multiprocessing extractor
- Register on on-demand queues
- Standardize around Python logging
- Polyglot information loss
- Provenance
- DataWolf
- Polyglot: add file.jpg.log and file.jpg.wf to id
- Clowder: each step is one of the extractors executed on the specific file
- Check file format at every step
- DataWolf
- Add new tools
- Look at the ones in Jira labeled as "Extractors" and "Converters"
- Praveen's new extractor
- Support students in doing this
- Move data vs move computation
- Low-hanging fruit implementation?
- Host large files locally?
- Logstash and Kibana
- Add Logstash to the Dockerfile
- Make extractor and software server logs consistent
- Standardize around Python logging
- Don't forget Java extractors (versus, audio)
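
Possible starting point for the "status constants" and "standardize around Python logging" items above. This is only a sketch built on the standard library `logging` and `enum` modules. The names `ExtractorStatus`, `get_extractor_logger`, and `report_status` are hypothetical and are not part of pyclowder; only the DONE and ERROR values come from these notes, and the log format is an assumption.

```python
import logging
from enum import Enum


class ExtractorStatus(Enum):
    """Hypothetical status constants; only DONE and ERROR come from the notes."""
    DONE = "DONE"
    ERROR = "ERROR"


def get_extractor_logger(name, level=logging.INFO):
    """Return a logger that uses one shared format (the format is an assumption)."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeat calls
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(name)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger


def report_status(logger, file_id, status, message=""):
    """Log a status message at a level matching the status and return the text."""
    level = logging.ERROR if status is ExtractorStatus.ERROR else logging.INFO
    text = ("file=%s status=%s %s" % (file_id, status.value, message)).strip()
    logger.log(level, text)
    return text


if __name__ == "__main__":
    log = get_extractor_logger("example.extractor")
    report_status(log, "file123", ExtractorStatus.DONE)
    report_status(log, "file123", ExtractorStatus.ERROR, "unsupported format")
```

Keeping the status in a single `key=value` line is one way to make the output easy to parse later in Logstash; the Java extractors would need an equivalent format for the logs to stay consistent.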