Brown Dog’s goal is to construct a Data Transformation Service (DTS) that will play a role similar to a Domain Name Service (DNS), existing as part of the Internet’s backbone for the purpose of making the contents of un-curated data collections more accessible. First, the DTS will support data conversions by building on a technology called a Polyglot Software Server, which replaces an application’s native interface with a uniform interface that can be easily programmed against. Using Software Servers, the DTS will chain together open/save operations within software applications for the purpose of seamlessly transforming unreadable files into readable ones, making future browsers and applications more agnostic to file formats. Second, the DTS will support metadata extractions by building on the Clowder framework for data management, sharing, curation, and publication. Using Clowder, the DTS will be able to serve as an active repository for both housing and utilizing content analysis software within the community for the purpose of indexing and automatically assigning metadata to un-curated collections. Supporting these two types of transformations will allow the DTS to act as a DNS for data, translating inaccessible un-curated data into information in a manner that is provenance preserving, so as to ensure reproducible science and enable new science over our vast collections of un-curated digital data. The intellectual merit of this work lies in the proposed solution, which does not attempt to construct a single piece of software that magically understands all data, but instead aims at utilizing and interconnecting every possible source of automatable help already in existence (i.e. software, libraries, code) in an extensible, robust, and scalable manner to create a service that can deal with as much of this data as possible. This proverbial “super mutt” of software, or Brown Dog DTS, will serve as a low-level infrastructure for data, enabling a new era of science and applications which will not only make use of, but rely upon, un-curated data sources. The broader impact of this work is in its potential to serve not just the scientific community but the general public, as a DNS for data, moving civilization towards an era where a user’s access to data is not limited by a file’s format or by un-curated collections. Three use cases spanning geoscience, biology, engineering, and social science are proposed to both drive and demonstrate the novel science that can be obtained with this underlying cyberinfrastructure.
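The chaining of open/save operations described above can be pictured as a shortest-path search over a graph whose nodes are file formats and whose edges are applications able to open one format and save another. The following is a minimal sketch under that assumption. The application names, the format table, and the function `find_conversion_path` are hypothetical and are not taken from Polyglot or the Software Server interface.

```python
from collections import deque

# Hypothetical capabilities: (application, input format, output format).
# Illustrative only; not actual Polyglot or Software Server data.
CONVERTERS = [
    ("app_a", "wrl", "obj"),
    ("app_b", "obj", "stl"),
    ("app_c", "stl", "ply"),
    ("app_d", "wrl", "x3d"),
]


def find_conversion_path(source, target, converters=CONVERTERS):
    """Return the shortest list of (app, in, out) steps, or None if unreachable."""
    if source == target:
        return []

    # Build an adjacency list: input format -> [(application, output format)].
    edges = {}
    for app, fmt_in, fmt_out in converters:
        edges.setdefault(fmt_in, []).append((app, fmt_out))

    # Breadth-first search finds a chain with the fewest open/save steps.
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        fmt, path = queue.popleft()
        for app, fmt_out in edges.get(fmt, []):
            if fmt_out in visited:
                continue
            step_path = path + [(app, fmt, fmt_out)]
            if fmt_out == target:
                return step_path
            visited.add(fmt_out)
            queue.append((fmt_out, step_path))
    return None
```

Each step in the returned chain corresponds to one open/save operation in one application. A real service would presumably also weigh information loss per step rather than only the number of steps, which this sketch does not attempt.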