- Created by Rob Kooper, last modified on Apr 28, 2015
https://dap.ncsa.illinois.edu/conversion/:output_format/
https://dts.ncsa.illinois.edu/extraction/:domain/
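The two endpoint templates above each carry one path parameter (`:output_format` and `:domain`). As a minimal sketch, assuming the `:name` segments are meant to be substituted with concrete values, a client could fill them in as follows. The helper `fill_template` and the example values are hypothetical, and nothing here describes the request method, headers, or payload the services expect.

```python
from urllib.parse import quote

# Endpoint templates as listed on this page.
DAP_TEMPLATE = "https://dap.ncsa.illinois.edu/conversion/:output_format/"
DTS_TEMPLATE = "https://dts.ncsa.illinois.edu/extraction/:domain/"


def fill_template(template: str, **params: str) -> str:
    """Replace each ':name' path segment with its URL-quoted value.

    Hypothetical helper for illustration; not part of any Brown Dog client.
    """
    segments = []
    for segment in template.split("/"):
        if segment.startswith(":"):
            name = segment[1:]
            if name not in params:
                raise KeyError(f"missing value for path parameter {name!r}")
            # Quote the value so it is safe to use as a single path segment.
            segment = quote(params[name], safe="")
        segments.append(segment)
    return "/".join(segments)


# Example values are placeholders, not a list of supported formats or domains.
conversion_url = fill_template(DAP_TEMPLATE, output_format="pdf")
extraction_url = fill_template(DTS_TEMPLATE, domain="example")
```

Only the URL construction is shown; how a file is submitted to either endpoint is left out because this page does not specify it.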
The objective of this project is to construct a service that allows past and present un-curated data to be used by science, while demonstrating the novel science that can be conducted with such data. The effort focuses on the large, distributed, and heterogeneous bodies of past and present un-curated data, often referred to in the scientific community as long-tail data: data that would have great value to science if its contents were readily accessible. The proposed framework is made up of two re-purposable cyberinfrastructure building blocks, a Data Access Proxy (DAP) and a Data Tilling Service (DTS). These building blocks will be developed and tested in the context of three use cases that will advance science in geoscience, biology, engineering, and social science. The DAP aims to enable a new era of applications that are agnostic to file formats through a tool called a Software Server, which itself serves as a workflow tool for accessing functionality within 3rd party applications. By chaining together open/save operations within arbitrary software, the DAP will provide a consistent means of gaining access to content stored across the large number of file formats that plague long-tail data. The DTS will use the DAP to access data contents and will serve to index unstructured data sources (i.e. instrument data or data without text metadata). Building on the Versus content-based comparison framework and the Medici extraction services for auto-curation, the DTS will assign content-specific identifiers to untagged data, allowing one to search collections of such data.
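The idea of chaining open/save operations can be pictured as a path search over a graph whose nodes are file formats and whose edges are "application X can open format A and save format B". The sketch below uses a breadth-first search to find the shortest chain. It is only an illustration of the concept, not the DAP's actual algorithm; the application names, the conversion table, and the function `find_conversion_chain` are all hypothetical.

```python
from collections import deque

# Hypothetical conversion table: (application, input format, output format).
# These entries are made up for illustration only.
CONVERSIONS = [
    ("app_a", "doc", "odt"),
    ("app_b", "odt", "pdf"),
    ("app_c", "doc", "rtf"),
    ("app_d", "rtf", "txt"),
]


def find_conversion_chain(source, target, conversions=CONVERSIONS):
    """Return the shortest list of (app, input, output) steps, or None."""
    if source == target:
        return []  # nothing to convert

    # Index outgoing edges by input format.
    edges = {}
    for app, fmt_in, fmt_out in conversions:
        edges.setdefault(fmt_in, []).append((app, fmt_out))

    # Breadth-first search: the first time the target is reached,
    # the path used has the fewest open/save steps.
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        fmt, path = queue.popleft()
        for app, next_fmt in edges.get(fmt, []):
            if next_fmt in visited:
                continue
            step = (app, fmt, next_fmt)
            if next_fmt == target:
                return path + [step]
            visited.add(next_fmt)
            queue.append((next_fmt, path + [step]))
    return None  # no chain of applications connects the two formats


chain = find_conversion_chain("doc", "pdf")
```

Breadth-first search minimizes the number of steps only; a real service would likely also weigh information loss and cost per conversion, which this sketch ignores.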
The intellectual merit of this work lies in the proposed solution, which does not attempt to construct a single piece of software that magically understands all data. Instead, it aims to use every possible source of automatable help already in existence, in a robust and provenance-preserving manner, to create a service that can deal with as much of this data as possible. This proverbial “super mutt” of software, or Brown Dog, will serve as a low-level data infrastructure for interfacing with digital data contents and, through its capabilities, enable a new era of science and applications at large. The broader impact of this work is its potential to serve not just the scientific community but the general public, as a DNS for data, moving civilization towards an era where a user’s access to data is not limited by a file’s format or by whether a collection has been curated.
- Deployments
- Guides and Notes
- Discussions / Under Development
- Chrome Extension for DTS
- CI-BER Testbed
- Data and Tools
- Extractors v2
- Ideas for Cyberintegrator for BrownDog
- LIDAR Tools
- Live Examples
- Metadata Representation
- Metadata Schemas
- Obtaining Extractors Information - Heartbeats approach
- Polyglot Refactoring
- Polyglot / SoftwareServer Documentation
- REST Endpoint Consistency
- Relationships between Files
- Spaces
- Sprints
- Tool Catalog
- Tool (or service) to generate a River Profile
- Tools
- VM Elasticity
- Miscellaneous Topics
Blog Posts
- Blog: NCSA Brown Dog and Box Skills Speed up Astronomical Research (Apr 12, 2019)
- Blog: The Predictive Ecosystem Analyzer - PEcAn (Mar 21, 2018)
- Blog: Using Machine Learning to Understand Public Preference Toward Landscape Design (Feb 21, 2018)
- Blog: Brown Dog Tutorials (Feb 01, 2018)
- Blog: What is Brown Dog Video (Nov 06, 2015)
- Blog: IEEE Big Data (Nov 06, 2015)
- Blog: Brown Dog Cheat Sheet (Sep 16, 2015)
- Blog: To Be Heard and Not Seen (Apr 27, 2015)
- Blog: Critical Zone (Apr 25, 2015)
- Blog: PEcAn (Apr 25, 2015)
- Blog: Green Infrastructure (Apr 25, 2015)
- Blog: Brown Dog Clients: A Command Line Interface (Mar 18, 2015)
- Blog: The Brown Dog Tools Catalog (Mar 05, 2015)
- Blog: Ecological Model Data Conversions: PEcAn (Mar 03, 2015)
- Blog: Brown Dog Clients: Chrome Extension (Mar 03, 2015)