What is a data repository? According to the E-Science Thesaurus, a data repository can be broadly “defined as a place that holds data, makes data available to use, and organizes data in a logical manner.”1 The National Institutes of Health (NIH) further defines repositories by level of security to accommodate sensitive data:2
- Data archive—a place where machine-readable data are acquired, manipulated, documented, and finally distributed to the scientific community for further analysis.
- Data enclave—a controlled, secure environment in which eligible researchers can perform analyses using restricted data resources.
NIH and National Science Foundation policies require that research data developed with federal funds be shared with other researchers. Data repositories provide the technical platform that enables the sharing, discovery, validation, and reuse of those data. They also support greater efficiency throughout the scientific process.
What advantages does a data repository offer a health sciences researcher? Besides convenient storage and professional long-term preservation of your research data, a data repository provides:
- Updates to new data formats
- Enhanced discoverability
- Increased citation rates
- Access to a variety of datasets to explore
- Ability to reuse validated and unique datasets
- More efficient workflow
When selecting a data repository, first check for funder, journal, or institutional requirements, and maintain compliance with your research protocols. General data repositories as well as subject-specific repositories are represented in the searchable directories listed below.
General:
Directories:
- Databib—searchable global registry
- DataCatalogs.org—global open government data
- NIH Data-Sharing Repositories—table of NIH-supported data repositories
- Re3Data.org (Registry of Research Data Repositories)—searchable global registry
For previous articles on data management published in the HSLS Update, please see:
- Data Management Planning, February 2013
- Metadata, March 2013
- Storage, Backup, and Security, May 2013
- Data Ownership, July 2013
- Data Sharing, September 2013
- Data Management Planning: Privacy and Ethical Issues, November 2013
1. E-Science Thesaurus: Data Repository. E-Science Portal for New England Librarians. Last updated: Sep. 5, 2013. Accessed Jan. 7, 2014.
2. Definitions: NIH Data Sharing Policy and Implementation Guidance. National Institutes of Health (NIH). Bethesda, MD. Last updated: March 5, 2003. Accessed Jan. 7, 2014.
~ Andrea M. Ketchum