This page is intended to capture information related to Jira issue NDS-211.

Overview

The goal of this project is to develop a general-purpose research data repository "recommender" service to be hosted by the NDS. The basic use case is a researcher who has data to deposit but does not know where to put it. A few possible scenarios:

There is no existing community repository
The data doesn't fit the researcher's usual repository. For example, the researcher works in a new interdisciplinary space or has data that they believe might be useful to another community.
Novice or "lazy" user – however, most advice for these users will come from social media, conferences, and training.

There are several existing services in this space, including the Registry of Research Data Repositories (RE3Data), Biosharing.org, and the SEAD C3PR service. In informal discussions, the U of I Research Data Service made the following recommendation:

"Deposition of data into a web-accessible repository is generally the preferred mechanism for public data sharing because it ensures wide-spread and consistent access to the data. If your discipline already has a trusted repository, we recommend you deposit where your community knows to look. To find a repository, re3data.org is a large, vetted, and searchable catalog of data repositories. If no discipline-specific repository exists, there are several options, including Illinois’ IDEALS repository (free) and other general-purpose repositories like DataDryad (fee-based)."

In addition to these existing registries of research data repositories, funding agencies and publishers provide lists of recommended repositories.

To be useful, the NDS repository recommender must differentiate itself from these existing tools and services. For example:

Improved search over Re3Data through the use of priors (e.g., "trustworthiness" or some sort of impact factor)
Accounting for user motivations (funding agency requirements, publisher requirements, data size) through guided search
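
As a rough illustration of the two ideas above, the sketch below first filters repositories by the user's stated constraints (guided search) and then ranks the remainder by a log-linear combination of text relevance and a prior. Everything in it is an assumption made for illustration: the `Repository` fields, the `recommend` function, the repository and funder names, and all scores are hypothetical and do not come from Re3Data or any other registry.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Repository:
    # Hypothetical record; field names and values are placeholders.
    name: str
    relevance: float       # text-match score in (0, 1] from some search backend
    trust_prior: float     # prior in (0, 1], e.g. a "trustworthiness" estimate
    max_dataset_gb: float  # largest deposit the repository accepts
    accepted_by: set = field(default_factory=set)  # funders/publishers recommending it


def recommend(repos, funder=None, dataset_gb=0.0, prior_weight=1.0):
    """Filter by the user's constraints, then rank by relevance and prior."""
    candidates = [
        r for r in repos
        if r.max_dataset_gb >= dataset_gb
        and (funder is None or funder in r.accepted_by)
    ]

    # Log-linear combination: log(relevance) + prior_weight * log(prior).
    def score(r):
        return math.log(r.relevance) + prior_weight * math.log(r.trust_prior)

    return sorted(candidates, key=score, reverse=True)


# Made-up example data.
repos = [
    Repository("repo-a", 0.9, 0.2, 10, {"funder-x"}),
    Repository("repo-b", 0.7, 0.8, 500, {"funder-x", "funder-y"}),
    Repository("repo-c", 0.8, 0.9, 5, {"funder-y"}),
]

ranked = [r.name for r in recommend(repos, funder="funder-x", dataset_gb=1)]
print(ranked)
```

With `prior_weight=0.0` the ranking falls back to plain relevance, which makes it easy to compare results with and without the prior.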

Background

What tools already exist in this space?

...

Registry	Description	Notes
Re3Data	Registry of research data repositories	Started from Databib; crowd-sourced. Metadata is too general for search; user feedback: "precision is horrible". Not based on natural language.
Biosharing.org	Registry of databases and policies for life/environmental/bio sciences	Schema based on BioDBCore (http://biocuration.org/community/standards-biodbcore/). Data is not available, but will be. BioSharing: curated and crowd-sourced metadata standards, databases and data policies in the life sciences.
Cinergi	Community Inventory of EarthCube Resources for Geosciences Interoperability	Curated database of geoscience information resources Used
OpenAIRE	OpenAIRE data provider search	Publishes guidelines for data archives
LA Referencia

There are (at least) two major registries of research data repositories. Publishers and funding agencies often direct researchers to search for repositories using these tools:

...

The schema is based on BioDBCore: http://biocuration.org/community/standards-biodbcore/
License: Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)
See also:
- BioSharing: curated and crowd-sourced metadata standards, databases and data policies in the life sciences

...


bioCADDIE	Data discovery index	Index of data; aims to "do for data what PubMed did for literature"
OpenDOAR	Directory of open-access repositories
SHARE	Index of research activities/outputs including data management plans, grant proposals, preprints, presentations, and data repository deposits

...

Publishers refer to both in their lists of recommended repositories, but both services appear to be intended for librarians, curators, publishers, and funding agencies rather than the average researcher. The re3data registry data is readily available for download and could be incorporated into our system. It's not clear whether the BioSharing data is available (technically, it could be crawled).

...

(This list is not exhaustive – it's likely that many publishers, agencies, and organizations will provide similar lists):


NIH

...

https://www.nlm.nih.gov/NIHbmic/nih_data_sharing_repositories.html

...

Note that the Biosharing database already includes information about whether a repository is recommended by a funding agency:

Elsevier

...

	https://www.elsevier.com/?a=57755 https://www.elsevier.com/books-and-journals/content-innovation/data-base-linking/supported-data-repositories http://www.journals.elsevier.com/data-in-brief/policies-and-guidelines/public-repositories-to-store-and-find-data
Nature	http://www.nature.com/authors/policies/availability.html http://www.nature.com/sdata/policies/repositories http://www.nature.com/sdata/policies/data-policies
PLOS	http://blogs.plos.org/everyone/2015/07/02/plos-recommended-data-repositories/ http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories
Libraries	https://library.uoregon.edu/datamanagement/sharingdata.html http://www.library.cmu.edu/datapub/dms/respositories
Other	http://www.ijdc.net/index.php/ijdc/article/viewFile/9.1.152/349 http://www.rdc-drc.ca/wp-content/uploads/Review-of-Research-Data-Repositories-2015.pdf AMS: https://www.ametsoc.org/ams/index.cfm/publications/authors/journal-and-bams-authors/journal-and-bams-authors-guide/data-archiving-and-citation/ AGU: http://publications.agu.org/files/2014/06/Data-Repositories.pdf http://openarchaeologydata.metajnl.com/about/#repo https://www.datacite.org/services/find-repository.html

SEAD C3PR/Matchmaker

...

SEAD Publication API

Other sources of information:

What other sources of information might we include in a recommender service?

Researcher identifiers, such as ORCID (a persistent digital identifier for researchers): these might help in collecting researcher profile information that can be used for recommendation.
Journal/publication information: We can relate specific journals to data repositories. If the user is publishing in a specific journal, we can recommend where to put the data.
Abstract: Use text matching techniques to match an abstract to a repository.
https://www.datacite.org/
BrownDog: Can we use information from extractors to identify criteria for recommendation?
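
One way the abstract-matching idea above could work is sketched below: it compares an abstract to short repository descriptions using TF-IDF weights and cosine similarity, with only the Python standard library. This is an assumption-laden illustration, not a chosen design; the repository names, descriptions, abstract, and the helper `match_abstract` are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using smoothed IDF."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(tokens).items()}
            for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def match_abstract(abstract, repo_descriptions):
    """Rank repositories by similarity between their description and the abstract."""
    names = list(repo_descriptions)
    vectors = tfidf_vectors([repo_descriptions[n] for n in names] + [abstract])
    query = vectors[-1]  # the abstract is the last document
    scored = [(name, cosine(query, vec)) for name, vec in zip(names, vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up descriptions and abstract, for illustration only.
descriptions = {
    "genomics-repo": "genome sequence data and gene expression for life sciences",
    "geo-repo": "geoscience observations including seismic and climate measurements",
    "general-repo": "general purpose repository for any research data",
}
abstract = "We sequenced the genome of a soil bacterium and measured gene expression."

results = match_abstract(abstract, descriptions)
print(results[0][0])
```

The same scoring could be applied to journal scope statements to support the journal-to-repository idea, though that would need a curated mapping that does not exist yet.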

...

Versions Compared

Old Version 24

New Version 25
