Abstract

Many challenges hinder the seamless integration of models with data, compelling scientists to perform the integration manually. The primary challenges stem from the knowledge latency between model and data resources; others derive from inadequate adoption and exploitation of information technologies. Knowledge latency challenges increase exponentially when a user aims to integrate long-tail data (data collected by individual researchers or small research groups) and long-tail models (models developed by individuals or small modeling communities). We focus on these long-tail resources because, despite their often-narrow scope, they have significant impacts in scientific studies and present an opportunity for addressing critical gaps through automated integration. The goal of this research is to develop a framework rooted in semantic techniques and approaches to support “long-tail” model and data integration. Our vision is to develop a decentralized knowledge-based platform that can be easily adopted across geoscience communities of individual and small-group researchers, allowing semantically heterogeneous systems to interact with minimum human intervention. It will allow data from data resources to be referenced automatically by models by: (i) leveraging the Semantic Web; (ii) developing an automated semantic mediation tool; and (iii) developing a semantic knowledge discovery system that can be used by long-tail models.

Science Challenges

Our goal is to enable the integration of long-tail data, i.e., data collected by individual researchers or small research groups, and long-tail models, i.e., models developed by individuals or small modeling communities, using a framework rooted in semantic techniques. We focus on these long-tail resources because, despite their often-narrow scope, they have significant impacts in scientific studies and present an opportunity for addressing critical gaps through automated integration. We aim to develop a decentralized knowledge-based platform that can be easily adopted across geoscience communities of individual and small-group researchers, allowing semantically heterogeneous systems to interact with minimum human intervention.

Goals and Vision

Develop a decentralized knowledge-based platform that allows semantically heterogeneous systems to interact with minimum human intervention. 

We will build on two existing technologies: 

We will also integrate with ongoing EarthCube initiatives, including GeoSoft, Earth System Bridge, SEN (Sediment Experimentalist Network), and eWELL (Workforce Education and Learning Library).

Design Overview 

The framework consists of three layers:

Key technologies used in the framework

Contribution

   Scientific Contribution

The Geosemantics framework will directly augment multidisciplinary interaction between geoscience communities by minimizing the human intervention needed for semantic mediation between resources, reducing ambiguity about their context, and supporting “crosswalks” among geoscience Standard Names.

   Technical Contribution