- Created by Rob Kooper, last modified by Kenton McHenry on May 15, 2017
You are viewing an old version of this page. View the current version.
Compare with Current View Page History
« Previous Version 236 Next »
The objective of this project is to construct a service that will allow for past and present, un-curated data to be utilized by science while simultaneously demonstrating the novel science that can be conducted from such data. The proposed effort will focus on the large distributed and heterogeneous bodies of past and present un-curated data. This data is often referred to in the scientific community as "long-tail data": data that would have great value to science if its contents were readily accessible. The proposed framework will be made up of two re-purposable cyberinfrastructure building blocks referred to as a Data Access Proxy (DAP) and Data Tilling Service (DTS). These building blocks will be developed and tested in the context of three use cases that will advance science in geoscience, biology, engineering, and social science. The DAP will aim to enable a new era of applications that are agnostic to file formats through the use of a tool called a Software Server which itself will serve as a workflow tool to access functionality within 3rd party applications. By chaining together open/save operations within arbitrary software the DAP will provide a consistent means of gaining access to content stored across the large numbers of file formats that plague long tail data. The DTS will utilize the DAP to access data contents and will serve to index unstructured data sources (i.e. instrument data or data without text metadata). Building off of the Versus content based comparison framework and the Medici extraction services for auto-curation the DTS will assign content specific identifiers to untagged data allowing one to search collections of such data.
The intellectual merit of this work lies in the proposed solution which does not attempt to construct a single piece of software that magically understands all data, but instead aims at utilizing every possible source of automatable help already in existence in a robust and provenance preserving manner to create a service that can deal with as much of this data as possible. This proverbial “super mutt” of software, or Brown Dog, will serve as a low level data infrastructure to interface with digital data contents and through its capabilities enable a new era of science and applications at large. The broader impact of this work is in its potential to serve not just the scientific community but the general public as a DNS for data. Ultimately the goal is to move civilization towards an era where a user’s access to data is not limited by a file’s format or un-curated collections.
- Discussions / Under Development
- Alpha Beta Information Loss
- Beta Release
- Brown Dog API (Proposed Changes)
- Chrome Extension for DTS
- CI-BER Testbed
- Data and Tools
- Data Fetching Service & Data Movement Overall
- Deployment
- Extractors v2
- Fence User Quotas
- JSON-LD Support
- Ideas for Cyberintegrator for BrownDog
- LIDAR Tools
- Live Examples
- Metadata Encoding
- Metadata Representation
- Metadata Schemas
- Moving Computation Towards Data
- Obtaining Extractors Information- Heartbeats approach
- Polyglot Refactoring for Dockerizing SSes
- Polyglot Refactoring based on Clowder Architecture
- Polyglot / SoftwareServer Documentation
- REST Endpoint Consistency
- Relationships between Files
- Request a type of data be supported by the DAP and/or DTS
- Spaces
- Sprints
- Story Boards
- Transformations
- Tool Catalog
- Tool (or service) to generate a River Profile
- Tools
- VM Elasticity
- Miscellaneous Topics
Blog Posts
-
Blog: NCSA Brown Dog and Box Skills Speed up Astronomical Research
created by
Apr 12, 2019
-
Blog: The Predictive Ecosystem Analyzer - PEcAn
created by
Mar 21, 2018
-
Blog: Using Machine Learning to Understand Public Preference Toward Landscape Design
created by
Feb 21, 2018
-
Blog: Brown Dog Tutorials
created by
Feb 01, 2018
-
Blog: What is Brown Dog Video
created by
Nov 06, 2015
-
Blog: IEEE Big Data
created by
Nov 06, 2015
-
Blog: Brown Dog Cheat Sheet
created by
Sep 16, 2015
-
Blog: To Be Heard and Not Seen
created by
Apr 27, 2015
-
Blog: Critcal Zone
created by
Apr 25, 2015
-
Blog: PEcAn
created by
Apr 25, 2015
-
Blog: Green Infrastructure
created by
Apr 25, 2015
-
Blog: Brown Dog Clients: A Command Line Interface
created by
Mar 18, 2015
-
Blog: The Brown Dog Tools Catalog
created by
Mar 05, 2015
-
Blog: Ecological Model Data Conversions: PEcAn
created by
Mar 03, 2015
-
Blog: Brown Dog Clients: Chrome Extension
created by
Mar 03, 2015
- No labels