Which Data Repository to Choose?

Many journals and funding agencies now require researchers to deposit their data in publicly accessible databases or repositories. Doing so helps ensure the long-term accessibility and preservation of the data and increases its discoverability and reuse.

The number of sustainable online repositories available to host and archive research data may seem overwhelming. Guidance for repository selection is offered below; the HSLS Data Management Repository Web site is also available. Note: before selecting a repository, researchers should review that repository's deposit directions and policies.

  • Always check to see what might be mandated.

Some federal agencies, policies, grants, or journals may specify or suggest where data should be deposited. For example, the National Institutes of Health (NIH) Genomic Data Sharing Policy expects “that genomic research data from NIH-supported studies involving human specimens as well as non-human and model organisms will be submitted to an NIH-designated data repository” and provides a list of relevant databases.

  • Consider discipline-specific or research model-specific repositories.

For many disciplines, there are repositories familiar to and well-used by researchers in the field (e.g., the CardioVascular Research Grid). In addition, there are repositories for specific research models (e.g., the Zebrafish Model Organism Database).

If you are new to an area of research or simply unfamiliar with the repositories in your field, search re3data.org, a global registry of research data repositories. This resource will help you locate appropriate repositories and highlights the key characteristics of each, such as whether the repository provides open, restricted, or closed access to its data.

  • Turn to a general data repository if no discipline-specific or research model-specific one exists.

Several general or multidisciplinary repositories are available and widely known. A few to consider are DRYAD, figshare, and Mendeley Data (beta).

For more information, e-mail Melissa Ratajeski, coordinator of data management services, at mar@pitt.edu or call 412-648-1971.

~Melissa Ratajeski