Goals
- CSV files uploaded to Clowder are annotated with information about the variables contained within the file using standard vocabularies.
- This metadata, together with metadata about the location or sensor attached to a dataset, is used to automatically ingest data into the Geostreaming API.
Components
- Clowder
- Dataset is annotated with sensor information
- Reuse existing relationship between dataset and sensor
- Or, alternatively, add the sensor information as metadata on the dataset
- Variable Annotation Extractor (VAE)
- Annotate files with entries from standard vocabularies
- For example, column 3 contains the term http://odm2/precipitation
- Multiple mappings can be provided, each with their own likelihood
- For example, if only 9 out of 10 columns match a prior mapping, likelihood is 90%
- Or percentage of files seen with this type of mapping
- Variables Mapping Service (VMS)
- Tracks mappings between strings (column headers) and standard vocabularies (uri terms)
- Semantic Annotation Service (SAS)
- Datapoints Extractor (DPE)
- Creates datapoints in the Geostreaming API based on rows in the CSV input file
- Requires mapping from Variable Annotation Extractor
- Site information as metadata on dataset
- Geostreaming Data Framework
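The column-match likelihood described for the VAE above could be computed roughly as follows. This is a minimal sketch, not the actual extractor code: the function name `mapping_likelihood`, the header normalization (trim and lowercase), and the shape of the prior mapping (a dict from header string to URI) are all assumptions made for illustration.

```python
def mapping_likelihood(headers, prior_mapping):
    """Return the fraction of headers that appear in a prior mapping.

    headers: list of column header strings from a CSV file.
    prior_mapping: dict from normalized header string to vocabulary URI
                   (hypothetical shape, assumed for this sketch).
    """
    if not headers:
        return 0.0
    matched = sum(1 for h in headers if h.strip().lower() in prior_mapping)
    return matched / len(headers)


# Hypothetical prior mapping; only the precipitation URI comes from this page.
prior = {"precipitation": "http://odm2/precipitation"}

# One of the two headers matches, so the likelihood is 1/2.
likelihood = mapping_likelihood(["Precipitation", "unknown_col"], prior)
```

With 9 of 10 headers matching, the same function yields 0.9, which corresponds to the 90% case in the list above.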
Workflow
- File F1 (CSV) uploaded to dataset D1
- VAE reads the column headers from F1
- VAE requests matching mappings from mapping service VMS
- VAE adds metadata entries to file F1
- DPE extracts datapoints from CSV and adds them to GSAPI
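The workflow above can be sketched end to end with in-memory stand-ins. Everything here is an assumption for illustration: the `VMS` dict replaces the real mapping service, the metadata entry fields (`column`, `label`, `term`, `likelihood`) are a hypothetical layout, columns are numbered from 1, and the final step only builds datapoint dictionaries rather than calling the Geostreaming API.

```python
import csv
import io

# Stand-in for the VMS: normalized header -> list of (URI, likelihood) candidates.
VMS = {
    "precipitation": [("http://odm2/precipitation", 0.9)],
}


def read_headers(csv_text):
    """Return the first row of the CSV as a list of header strings."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


def annotate(headers, vms):
    """VAE step: build one metadata entry per (column, candidate term)."""
    entries = []
    for index, label in enumerate(headers, start=1):
        for uri, likelihood in vms.get(label.strip().lower(), []):
            entries.append({"column": index, "label": label,
                            "term": uri, "likelihood": likelihood})
    return entries


def extract_datapoints(csv_text, entries):
    """DPE step: one dict per data row, keyed by the best term per column."""
    best = {}
    for entry in entries:
        current = best.get(entry["column"])
        if current is None or entry["likelihood"] > current["likelihood"]:
            best[entry["column"]] = entry
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    points = []
    for row in reader:
        points.append({e["term"]: row[col - 1]
                       for col, e in best.items() if col - 1 < len(row)})
    return points


sample = "time,site,precipitation\n2017-01-01,S1,1.2\n2017-01-02,S1,0.0\n"
entries = annotate(read_headers(sample), VMS)
datapoints = extract_datapoints(sample, entries)
```

Values are kept as strings, exactly as read from the CSV; type conversion and site lookup from the dataset metadata are left out of this sketch.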
Tasks
- Update https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/extractors-csv to store more information
- which column has which header
- include column number and label, for example (3, "temperature")
- Develop Variables Mapping Service (VMS)
- Simple Flask app with a MongoDB back end
- Variable Annotation Extractor (VAE)
- An extension of the extractors-csv extractor that queries the VMS and stores standard names in metadata
- We should support multiple mappings added to metadata
- Figure out where the frontend should be
- Standalone client
- Clowder add metadata widget
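For the VMS task above, one possible shape for the mapping store is sketched below, using the "percentage of files seen with this type of mapping" idea as the likelihood. This is an in-memory stand-in, not the Flask and MongoDB service itself; the class name `MappingStore`, its methods, and the second URI are hypothetical.

```python
from collections import Counter, defaultdict


class MappingStore:
    """In-memory stand-in for the VMS: counts how often a header maps to a URI."""

    def __init__(self):
        # normalized header -> Counter of URI -> number of times recorded
        self._counts = defaultdict(Counter)

    @staticmethod
    def _normalize(header):
        return header.strip().lower()

    def record(self, header, uri):
        """Register that one file used this header for this vocabulary term."""
        self._counts[self._normalize(header)][uri] += 1

    def lookup(self, header):
        """Return (uri, share) pairs for a header, most frequent first."""
        counts = self._counts.get(self._normalize(header))
        if not counts:
            return []
        total = sum(counts.values())
        return [(uri, n / total) for uri, n in counts.most_common()]


store = MappingStore()
for _ in range(3):
    store.record("Precip", "http://odm2/precipitation")
# Hypothetical competing term, used only to show multiple mappings.
store.record("precip", "http://example.org/vocab/rainfall")

candidates = store.lookup("PRECIP ")
```

Returning every candidate with its share, rather than only the top one, leaves room for the VAE to attach multiple mappings to a file's metadata.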