Development of pyGeodashboard started on 2016-02-04. It is a library that contains the basic functions needed for parsing sensors, streams, and datapoints to the geostreaming API.
- Step 1: Outline Parser Functions
- Create an outline that describes the parsing process, with a focus on separating the reusable portions of a parser from those that are specific to a data source.
Outline Parser Functions
Each function is labeled either unique (it needs code particular to the source) or general (it should be able to run unchanged for every source).
- Get data from source (unique)
- Parsing begins by getting the data from the source. Two types of data are needed:
- data that describes the site such as geocodes, name, and source.
- measurements
- The format and retrieval method vary from source to source.
- Some source formats
- API for a single station (USGS, NOAA)
- API for mixed stations (Water Quality Portal)
- Files stored to a server with LoggerNet (GREON)
- CSV download (LRTM)
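As a rough illustration of the "get data from source" step for a CSV-download source, the sketch below returns the two kinds of data described above: site descriptions and measurements. Everything in it is an assumption made for illustration: the `SiteInfo` and `Measurement` structures, the `fetch_from_csv` function, and the CSV column names are hypothetical and are not part of pyGeodashboard.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class SiteInfo:
    """Descriptive data for one site (hypothetical structure)."""
    name: str
    source: str
    latitude: float
    longitude: float


@dataclass
class Measurement:
    """One observation at a site (hypothetical structure)."""
    site_name: str
    timestamp: str
    parameter: str
    value: float


def fetch_from_csv(text, source):
    """Unique step for a CSV-download source: return (sites, measurements).

    Assumes the columns: site, lat, lon, time, parameter, value.
    """
    sites = {}
    measurements = []
    for row in csv.DictReader(io.StringIO(text)):
        name = row["site"]
        # Record each site once, the first time it appears.
        if name not in sites:
            sites[name] = SiteInfo(
                name, source, float(row["lat"]), float(row["lon"])
            )
        measurements.append(
            Measurement(name, row["time"], row["parameter"], float(row["value"]))
        )
    return list(sites.values()), measurements
```

An API-based source would need a different fetch function, but if it returned the same two structures, the later steps would not need to know where the data came from.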
- Parse data to sensor (unique and general)
- Until now, this has been a unique process for each source; however, it should be split into two parts:
- reformat the data into a standard format that can be passed to the general parser (unique)
- parse the data to sensor JSON (general)
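A minimal sketch of the proposed two-part split for sensors is shown below. It is only an illustration under stated assumptions: the function names, the source field names (`StationName`, `Lat`, `Lon`), and the GeoJSON-like layout of the sensor document are all hypothetical, and the layout is not a confirmed schema of the geostreaming API.

```python
def reformat_example_source(raw):
    """Unique step: map one source's field names onto a standard dict.

    The input keys ("StationName", "Lat", "Lon") are made-up examples
    of source-specific names.
    """
    return {
        "name": raw["StationName"],
        "source": "example-source",
        "latitude": float(raw["Lat"]),
        "longitude": float(raw["Lon"]),
    }


def sensor_to_json(standard):
    """General step: build a sensor document from the standard dict.

    The GeoJSON-like layout is an assumption about what the
    geostreaming API accepts, not a confirmed schema.
    """
    return {
        "name": standard["name"],
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON order is longitude first, then latitude.
            "coordinates": [standard["longitude"], standard["latitude"]],
        },
        "properties": {
            "name": standard["name"],
            "source": standard["source"],
        },
    }
```

With this split, adding a new source would mean writing only a new reformat function; `sensor_to_json` would stay the same for every source.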
- Parse data to stream(s) (unique and general)
- Similar to parsing to a sensor; the main difference is that a source can have multiple streams, for different reasons.
- For example:
- GREON uses 2 streams - one for water quality data and one for environmental data
- USGS uses 5 streams - water quality measurements, gap filled nitrate, gap filled discharge, load, and cumulative load
- Two different conventions have been used, and they need to be standardized:
- GREON encodes the stream type in the stream name: GREON-07_MD or GREON-07_WQ
- USGS puts a data_type key in properties with possible values: source_data, fill_nitrate, fill_discharge, calc_load, and calc_cumul_load
- Which convention to adopt still needs to be discussed and decided.
- Currently, each source has its own implementation for all of this; as with sensors, it should be split into unique and general portions:
- get stream data from the source, including determining the number of streams needed and parsing the data to a standard format (unique)
- parse the data to stream JSON (general)
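To make the two stream conventions concrete, the sketch below shows one possible way to reconcile them: a suffixed name such as GREON-07_WQ is split into a sensor name and a type, and every stream is then described by a `data_type` property. This is only one option and does not settle the open decision. The function names and the layout of the stream document are hypothetical, not the geostreaming API's confirmed schema.

```python
def split_suffixed_name(stream_name):
    """Split a name such as "GREON-07_WQ" into ("GREON-07", "WQ").

    Returns (stream_name, "") when the name has no underscore suffix.
    """
    sensor_name, sep, suffix = stream_name.rpartition("_")
    if not sep:
        return stream_name, ""
    return sensor_name, suffix


def stream_to_json(sensor_name, data_type):
    """General step: one stream document per (sensor, data_type) pair.

    The layout is an assumed, simplified stream document.
    """
    return {
        "name": sensor_name,
        "properties": {"data_type": data_type},
    }


def streams_for_source(sensor_name, data_types):
    """Build one stream document for each data type the source needs."""
    return [stream_to_json(sensor_name, dt) for dt in data_types]
```

Under this option, the unique step for each source would only decide which `data_type` values it needs (two for GREON, five for USGS), and the general step would build the stream JSON.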
- Parse data to datapoints