
...

CSV files uploaded to Clowder are annotated with information about the variables contained within the file using standard vocabularies.
This metadata, together with metadata about the location or sensor attached to a dataset, is used to automatically ingest data into the Geostreaming API.
Given an annotated tabular file, apply unit conversion to specific columns and create a new version of the tabular data.

Components

For example, if only 9 out of 10 columns match a prior mapping, the likelihood is 90%.
Alternatively, the likelihood could be the percentage of files seen with this type of mapping.

Variables Mapping Service (VMS)
Jira: BD-2310
- POST/GET/PUT/DELETE mappings
- The collection in MongoDB contains documents that represent mappings
  - Each mapping is a collection of mappings between strings (column headers) and standard vocabularies (URI terms)
  - How many times we have seen a particular mapping (how many unique files)
  - When a mapping is not complete, i.e. we can only identify a subset of the columns, we should keep track of how many columns we successfully identified
    - Let's say a CSV file has 10 columns but we can only tag 4; we would have 40% accuracy
- Maybe keep a collection of what files match what mapping
- SEARCH for mappings that match a set of CSV headers and return them in order of accuracy
  - Client submits one list of CSV column names, service returns a list of potential mappings including accuracies.
- Dockerize the service:
  - Jira: BD-2318
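The SEARCH behaviour described for the VMS could be sketched roughly as follows. This is only an illustration, not the service's actual implementation: the document shape (`id`, `columns`, `files_seen`), the function names, and the example.org URIs are all assumptions, and a plain list stands in for the MongoDB collection. Accuracy is taken to be the fraction of submitted headers that a stored mapping can tag, with the number of files seen used as a tie-breaker.

```python
def score_mapping(headers, columns):
    """Fraction of the submitted headers that a mapping can tag (0.0 to 1.0)."""
    if not headers:
        return 0.0
    matched = sum(1 for header in headers if header in columns)
    return matched / len(headers)


def search_mappings(headers, stored_mappings):
    """Return candidate mappings ordered by accuracy, best first.

    `stored_mappings` is a list of dicts with the hypothetical keys
    `id`, `columns` (header -> vocabulary URI) and `files_seen`.
    """
    results = []
    for doc in stored_mappings:
        accuracy = score_mapping(headers, doc["columns"])
        if accuracy > 0:
            results.append({
                "id": doc["id"],
                "accuracy": accuracy,
                "files_seen": doc.get("files_seen", 0),
            })
    # Highest accuracy first; ties go to the mapping seen in more files.
    results.sort(key=lambda r: (r["accuracy"], r["files_seen"]), reverse=True)
    return results


# Hypothetical data: a 10-column file and two stored mappings.
headers = [f"c{i}" for i in range(10)]
stored = [
    {"id": "m1", "files_seen": 3,
     "columns": {f"c{i}": f"http://example.org/vocab/v{i}" for i in range(4)}},
    {"id": "m2", "files_seen": 1,
     "columns": {f"c{i}": f"http://example.org/vocab/v{i}" for i in range(9)}},
]
ranked = search_mappings(headers, stored)
```

With this data, `m2` tags 9 of the 10 headers and ranks ahead of `m1`, which tags only 4.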
Semantic Annotation Service (SAS)
- http://ecgs.ncsa.illinois.edu/SAS.html
- We should build a simpler version of this as a Flask application storing info in MongoDB
Datapoints Extractor (DPE)

Geostreaming Data Framework
- Store and visualize datapoints
- https://geodashboard.ncsa.illinois.edu/
- Geostreaming API (GSAPI)
Unit Conversion Extractor
- Given a CSV file and information about what units to convert, return a new file with the specific column converted to new units
- Requires ability to show derived files in GUI
- How does the user specify what units they want?
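As one possible shape for the conversion step, the sketch below rewrites a single column of a CSV and returns the result as new text, leaving the input untouched. How the user specifies units is still an open question above, so a simple linear rule (`value * factor + offset`) passed in by the caller is assumed here. The function name, its parameters, and the sample column names are hypothetical.

```python
import csv
import io


def convert_column(csv_text, column, factor, offset=0.0, new_header=None):
    """Return new CSV text with `column` converted as value * factor + offset."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if column not in fieldnames:
        raise KeyError(f"column not found: {column}")
    target = new_header or column
    out_fields = [target if name == column else name for name in fieldnames]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=out_fields,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        raw = row.pop(column)
        try:
            converted = str(round(float(raw) * factor + offset, 6))
        except (TypeError, ValueError):
            # Leave blank or non-numeric cells as they are rather than guessing.
            converted = raw
        row[target] = converted
        writer.writerow(row)
    return out.getvalue()


# Hypothetical example: convert a depth column from feet to meters.
source = "site,depth_ft\nA,10\nB,\n"
derived = convert_column(source, "depth_ft", 0.3048, new_header="depth_m")
```

Returning a new file rather than editing in place matches the requirement that derived files can be shown alongside the original in the GUI.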

...