
These design notes concern exposing the NBI data (NDS-992) to users via Workbench.

See also Shared data directories


Download the NBI data

git clone https://github.com/kaleoyster/ProjectNBI

python ./nbiCsvJsonConverter-2/Downloadv1.py

Creates directory NBIDATA containing the raw data

python3 ./nbiCsvJsonConverter-2/ProcessMain.py

Converts the data to JSON format
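The steps above could be scripted so the download is repeatable. The following is a minimal sketch, not part of the ProjectNBI repo. It assumes the two scripts are run from the root of the cloned repository (the notes do not say which working directory they expect), and the helper names `build_steps` and `run_steps` are hypothetical.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/kaleoyster/ProjectNBI"


def build_steps(workdir: Path):
    """Return (command, working directory) pairs for the steps in these notes.

    Assumption: the converter scripts are invoked from the repo root.
    """
    repo = workdir / "ProjectNBI"
    return [
        (["git", "clone", REPO_URL, str(repo)], workdir),
        # Creates the NBIDATA directory containing the raw data.
        (["python", "./nbiCsvJsonConverter-2/Downloadv1.py"], repo),
        # Converts the raw data to JSON.
        (["python3", "./nbiCsvJsonConverter-2/ProcessMain.py"], repo),
    ]


def run_steps(workdir: Path) -> None:
    """Run each step in order and stop at the first failure."""
    for cmd, cwd in build_steps(workdir):
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run_steps(Path.cwd())` would clone the repo into the current directory and run both scripts; `check=True` makes a failed download abort before the conversion starts.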

Getting the data to Workbench

We can initially add the shared directory to Gluster and transfer or download the data directly. 

For this project, the raw data is probably less of a concern than the Mongo data – which poses an interesting question about data sharing.

Could we set up a Globus endpoint or use Santiago?

Import into MongoDB

We will host the raw data under /shared/NBIDATA/ as a read-only volume

We will also host an instance of MongoDB with the official database, possibly in the "public" namespace

Users can access via nbidata.public

This will require running a process to ingest the data.

The "public" namespace can be configured without service timeouts

Mongo must be accessible to all namespaces, even after we apply network security policies.
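The ingest process mentioned above might look roughly like the sketch below. It is only an illustration under several assumptions: that the converter writes `*.json` files each holding one object or a list of objects, that MongoDB listens on its default port 27017 at `nbidata.public`, and that the database and collection names (`nbi`, `bridges`) are placeholders. The function names are hypothetical. The only pymongo calls used are `MongoClient` and `Collection.insert_many`.

```python
import json
from pathlib import Path
from typing import Iterable, Iterator, List


def iter_records(json_dir: Path) -> Iterator[dict]:
    """Yield records from every *.json file under json_dir.

    Assumption: each file holds a single object or a list of objects.
    """
    for path in sorted(json_dir.rglob("*.json")):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        if isinstance(data, list):
            yield from data
        else:
            yield data


def batched(records: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group records into lists of at most `size` items."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:
        yield batch


def ingest(json_dir: Path, collection, batch_size: int = 1000) -> int:
    """Insert all records into `collection` and return the number inserted.

    `collection` is any object with an insert_many(list) method,
    such as a pymongo Collection.
    """
    total = 0
    for batch in batched(iter_records(json_dir), batch_size):
        collection.insert_many(batch)
        total += len(batch)
    return total


def connect_and_ingest(json_dir: Path,
                       uri: str = "mongodb://nbidata.public:27017") -> int:
    """Connect to the shared instance and ingest (names are placeholders)."""
    from pymongo import MongoClient  # imported lazily; requires pymongo
    client = MongoClient(uri)
    return ingest(json_dir, client["nbi"]["bridges"])
```

Inserting in batches keeps memory use bounded for large files. Because `ingest` only needs an `insert_many` method, it can be exercised against a stand-in object before pointing it at the shared instance.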

Metadata record

Do we host a record describing this dataset?

Data citation/identifiers

Is this a different version?

How do we deal with access? There is nothing to worry about with this dataset, but it may matter for future datasets.

Can someone still use this in 5-10 years?

How do we upgrade Mongo?

This is active data vs a live database

Metadata via Globus?

Need to point to FHWA and the GitHub repo

The GitHub repo needs to be tagged/versioned.
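To make the open questions above concrete, here is one possible shape for a metadata record. It is purely a sketch: the field names are hypothetical, no identifier or repository tag exists yet, and only values already mentioned in these notes are filled in.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    title: str
    source_agency: str
    converter_repo: str
    converter_version: Optional[str]  # git tag once the repo is versioned
    raw_data_path: str
    mongo_host: str
    identifier: Optional[str] = None  # citation identifier, still undecided


record = DatasetRecord(
    title="NBI data",
    source_agency="FHWA",
    converter_repo="https://github.com/kaleoyster/ProjectNBI",
    converter_version=None,  # unknown until the repo is tagged
    raw_data_path="/shared/NBIDATA/",
    mongo_host="nbidata.public",
)

# Serialize to JSON so the record could be stored next to the data.
print(json.dumps(asdict(record), indent=2))
```

Leaving `converter_version` and `identifier` empty makes the unresolved decisions visible in the record itself rather than hiding them.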
