This page should apply to all geodashboard projects.
Parsers built for GLTG
Repositories
The legacy repository for GLTG parsers is located at https://opensource.ncsa.illinois.edu/bitbucket/projects/GEOD/repos/gltg-parsers-py/browse. Some of the parser sources have been updated recently; others have not been touched for years.
When a parser is updated, it should be migrated to https://opensource.ncsa.illinois.edu/bitbucket/projects/GEOD/repos/pygeotemporal-parsers/browse.
Overview of GLTG parsers
- user = parsers (no password)
- root directory = /home/parsers
- directory structure
- 4 directories for 4 systems
- 3 parsers for 3 sources for each system
- run parsers
- each system has a shell script that runs all three sources sequentially
- for each source, all data is parsed first, then a subprocess runs the binning and the script waits until that subprocess finishes. The next source parser then starts. There is no timeout between source parsers.
- cronjobs
- get greon data from gltg.ncsa.illinois.edu:/var/opt/CampbellSci/loggernet_ordered
- runs as the marcuss user (this can be changed, but for now it is needed because of permissions on loggernet on gltg)
/home/marcuss/get_greon_data.sh
uses rsync to pull data to /home/marcuss/data/greon/, then copies it to /home/parsers/greon-data/
- parsers
- 4 lines for 4 systems
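The run order described above (parse a source, wait for its binning subprocess, then move to the next source) can be sketched in Python. This is only an illustration, not the actual shell scripts: the function names `run_source` and `run_system`, the order of the sources, and the shape of the `commands` mapping are all assumptions.

```python
import subprocess
import sys

# Assumption: source names and order taken from the timing list on this page.
SOURCES = ["greon", "iwqis", "usgs"]


def run_source(parse_cmd, bin_cmd):
    """Parse all data for one source, then block until binning finishes."""
    # The parse step must complete (and succeed) before binning starts.
    subprocess.run(parse_cmd, check=True)
    # Binning runs as a subprocess; wait() has no timeout, as described above.
    binning = subprocess.Popen(bin_cmd)
    return binning.wait()


def run_system(commands):
    """Run every source sequentially.

    commands is a hypothetical dict mapping a source name to a
    (parse_cmd, bin_cmd) pair of argument lists.
    """
    results = {}
    for source in SOURCES:
        parse_cmd, bin_cmd = commands[source]
        results[source] = run_source(parse_cmd, bin_cmd)
    return results


if __name__ == "__main__":
    # Placeholder commands that do nothing, so the sketch runs anywhere.
    noop = [sys.executable, "-c", "pass"]
    print(run_system({name: (noop, noop) for name in SOURCES}))
```

Because each step blocks, a slow or hung binning subprocess delays every source after it, which matches the "no timeout" behaviour noted above.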
General processes that take significant time
- updating sensor statistics - required at end of parsing to update start and end times https://opensource.ncsa.illinois.edu/bitbucket/projects/GEOD/repos/pygeotemporal/browse/pygeotemporal/sensors.py#239
- maybe it does more but not sure
- maybe the query can be simplified
System Parsing Times
Each source parses its data, then waits for the binning subprocess to finish. When the binning finishes, the next source parser starts. There is no timeout between parsers.
- gltg-dev
- resources
- nebula, postgres (4 CPU, 8G RAM)
- times by source
- greon 0.5h
- iwqis 1h
- usgs ~1.5h (an estimate, since USGS is having some latency issues)
- ilnlrs-dev
- major problem - authentication of cache client timing out
- added a try with a 10-minute wait on timeout followed by a retry, then another try with a 20-minute wait; it still timed out and crashed the parser
- this was with the usgs parser only
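The retry approach tried on ilnlrs-dev (wait 10 minutes, retry, wait 20 minutes, retry) can be sketched as below. This is a minimal sketch, not the parser's actual code: `call_with_retries` is a hypothetical helper, and treating the failure as a `TimeoutError` is an assumption, since the exception raised by the cache client is not recorded on this page.

```python
import time


def call_with_retries(func, waits=(600, 1200), retry_on=(TimeoutError,),
                      sleep=time.sleep):
    """Call func(); on a timeout, wait and retry once per entry in waits.

    waits holds the pauses in seconds (10 and 20 minutes by default).
    retry_on is an assumption about which exceptions signal a timeout.
    """
    for wait in waits:
        try:
            return func()
        except retry_on:
            sleep(wait)
    # Final attempt: any exception now propagates to the caller.
    return func()


if __name__ == "__main__":
    attempts = []

    def flaky_auth():
        # Stand-in for the cache client authentication call.
        attempts.append(1)
        if len(attempts) < 3:
            raise TimeoutError("authentication timed out")
        return "authenticated"

    # Pass a no-op sleep so the demonstration does not actually wait.
    print(call_with_retries(flaky_auth, sleep=lambda seconds: None))
```

If the last attempt also times out, the exception still escapes, which matches the crash described above. A caller that wants the parser to survive would have to catch it and skip or reschedule that source.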