  1. Polyglot refactoring
  2. Update extractors to the latest technologies
    1. JSON-LD
    2. Docker containers
    3. Extractor metadata registration
    4. pyclowder
    5. Add status messages to all extractors and fix level granularity
      1. Make status constants (DONE, ERROR)
      2. ArcGIS multiprocessing extractor
    6. Register on on-demand queues
    7. Standardize around Python logging
  3. Polyglot information loss
  4. Provenance
    1. DataWolf
      1. Polyglot: add file.jpg.log and file.jpg.wf to id
      2. Clowder: each step is one of the extractors executed on the specific file
      3. Check file format at every step
  5. Add new tools
    1. Look at the ones in Jira labeled as "Extractors" and "Converters"
    2. Praveen's new extractor
    3. Support students in doing this
  6. Move data vs move computation
    1. Low-hanging fruit implementation?
    2. Host large files locally?
  7. Logstash and Kibana
    1. Add Logstash to the Dockerfile
    2. Make extractor and software server logs consistent
      1. Standardize around Python logging
      2. Don't forget Java extractors (versus, audio)
  8. BDFiddle
    1. Automatic Process Adjustments
      1. Multiple results panes
        1. Extraction Results
        2. Conversion Results
      2. Remove colon on Extractors/Converters
        1. Extract
        2. Convert To
      3. Swap the conversion and extractor boxes to make better use of screen real estate
      4. Website Security
        1. Use an anonymous token/key with limits on file size and submissions. (Long Term - Not In Scope)
        2. Log in using username and password
          1. Sign-In page first
          2. Get key
          3. Fetch token
          4. Key and token displayed on top of page
      5. Indent code snippet buttons to line up with code pane
      6. Links for setup by code snippets
    2. Manual Process
      1. Metadata (Extraction)
        1. Allow selection of multiple metadata tools
        2. Pick only one tool to start
        3. Display error from extractor if it fails -> Need clear errors in the extractors
        4. List each tool specifically -> Get tools from tool catalog
      2. Conversion
        1. Populate the output (conversion) formats based on the input type of the file
        2. User will then select conversion format, which will then populate a list of tools to do the conversion
        3. Polyglot will give the list of available tools by conversion format
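
The three conversion steps above could be sketched roughly as follows. This is only an illustration of the lookup flow, not Polyglot's API: the catalog rows, the tool-to-format pairings, and the function names (`build_index`, `output_formats`, `tools_for`) are all hypothetical, and a real implementation would presumably ask Polyglot for the list instead of using a hard-coded table.

```python
from collections import defaultdict

# Hypothetical catalog rows: (tool name, input format, output format).
# Made up for illustration; the real list would come from Polyglot.
CATALOG = [
    ("ImageMagick", "jpg", "png"),
    ("ImageMagick", "jpg", "tiff"),
    ("GIMP", "jpg", "png"),
    ("ffmpeg", "wav", "mp3"),
]


def build_index(catalog):
    """Index tool names by input format, then by output format."""
    index = defaultdict(lambda: defaultdict(list))
    for tool, src, dst in catalog:
        index[src][dst].append(tool)
    return index


def input_type(filename):
    """Guess the input type from the file extension (empty string if none)."""
    return filename.rsplit(".", 1)[-1].lower() if "." in filename else ""


def output_formats(index, filename):
    """Step 1: conversion formats to offer for this file's input type."""
    return sorted(index.get(input_type(filename), {}))


def tools_for(index, filename, target):
    """Steps 2 and 3: tools that can convert this file to the chosen format."""
    return sorted(index.get(input_type(filename), {}).get(target, []))


index = build_index(CATALOG)
formats = output_formats(index, "photo.jpg")   # fills the format selector
tools = tools_for(index, "photo.jpg", "png")   # fills the tool list
```

Keying the index on input type first mirrors the order in which the user makes choices: the file determines the formats offered, and the chosen format determines the tools listed.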