https://bd-api.ncsa.illinois.edu/dap/ https://bd-api.ncsa.illinois.edu/dts/
The objective of this project is to construct a service that allows past and present un-curated data to be used by science, while simultaneously demonstrating the novel science that can be conducted with such data. The proposed effort will focus on the large, distributed, and heterogeneous bodies of past and present un-curated data. Such data is often referred to in the scientific community as "long-tail data": data that would have great value to science if its contents were readily accessible.
The proposed framework will be made up of two re-purposable cyberinfrastructure building blocks, referred to as a Data Access Proxy (DAP) and a Data Tilling Service (DTS). These building blocks will be developed and tested in the context of three use cases that will advance science in geoscience, biology, engineering, and social science.
The intellectual merit of this work lies in the proposed solution, which does not attempt to construct a single piece of software that magically understands all data. Instead, it aims to use every possible source of automatable help already in existence, in a robust and provenance-preserving manner, to create a service that can deal with as much of this data as possible. This proverbial “super mutt” of software, or Brown Dog, will serve as a low-level data infrastructure that interfaces with digital data contents and, through its capabilities, enables a new era of science and applications at large. The broader impact of this work is in its potential to serve not just the scientific community but also the general public as a DNS for data. Ultimately, the goal is to move civilization towards an era where a user’s access to data is not limited by a file’s format or by un-curated collections.