 Sharing research data can bring many benefits, including greater visibility for data creators, a more transparent research process, and opportunities to identify potential collaborators. But what about datasets that are stored on a lab server instead of in a data repository, or that should only be shared with vetted researchers? The Pitt Data Catalog is a new platform at HSLS designed to help Pitt health sciences researchers share and discover their otherwise hard-to-find datasets, while keeping ultimate control over the data in researchers’ hands.
“The Pitt data catalog has the potential to improve research collaborations and accelerate the impact of research being conducted in the schools of the health sciences. I strongly encourage each researcher to work with HSLS to make your datasets discoverable through the catalog in accordance with the FAIR Data Principles: Making Data Findable, Accessible, Interoperable and Reusable.” ~Dr. Arthur Levine, Senior Vice Chancellor for the Health Sciences
Unlike data repositories such as Dryad or Zenodo, the Pitt Data Catalog does not host any data files. Instead, each dataset included in the catalog is described in a metadata record that includes information about the dataset’s authors, subject domain, and data creation process, as well as instructions for accessing the dataset itself and links to associated publications. Some data catalog entries describe publicly available datasets, so their records link directly to the data in a repository. Other entries describe privately held datasets and may direct a visitor to e-mail the corresponding author or link to a data-access application form. Each record is created in collaboration with the researcher to ensure accurate and comprehensive information.
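For readers who think in terms of data structures, a catalog record of this kind might be sketched roughly as follows. This is only an illustration: the class name, field names, sample values, and placeholder addresses are all assumptions made for this example and do not reflect the catalog's actual schema or software.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    title: str
    authors: List[str]
    subject_domain: str
    creation_process: str
    access_instructions: str
    # Set only when the data files sit in a public repository.
    data_url: Optional[str] = None
    # Links to associated publications.
    publications: List[str] = field(default_factory=list)

    def is_publicly_available(self) -> bool:
        return self.data_url is not None

    def how_to_access(self) -> str:
        # Public datasets link straight to the repository; private ones
        # fall back to the instructions supplied by the researcher.
        if self.data_url is not None:
            return f"Download from {self.data_url}"
        return self.access_instructions


# Two made-up records: one public, one privately held.
public_record = DatasetRecord(
    title="Example public dataset",
    authors=["A. Researcher"],
    subject_domain="Epidemiology",
    creation_process="Survey responses collected online",
    access_instructions="See repository page",
    data_url="https://example.org/dataset/123",
)
private_record = DatasetRecord(
    title="Example private dataset",
    authors=["B. Researcher"],
    subject_domain="Genomics",
    creation_process="Sequencing of study samples",
    access_instructions="E-mail the corresponding author at author@example.org",
)

print(public_record.how_to_access())
print(private_record.how_to_access())
```

In this sketch the record never holds the data files themselves, only a description and a pointer, which mirrors how the catalog differs from a repository.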
If you have datasets you would like to have described in the Pitt Data Catalog, please contact the HSLS Data Services team at HSLSDATA@pitt.edu or through our dataset inclusion form. We’ll schedule an in-person or phone consultation to learn more about your datasets and discuss the most appropriate terminology to describe your data. After we create a draft of your dataset’s record, we’ll send it to you for final approval. If you have updates after the record is published, just contact us to make changes; we may also contact you to make sure our information is still current.
HSLS Data Services staff are happy to give demonstrations for individual health sciences researchers, departments, or labs. If you would like to investigate whether the Pitt Data Catalog would be a good match for your datasets, please reach out and we will gladly explore its possibilities with you.
The University of Pittsburgh, Health Sciences Library System, is a member of the Data Catalog Collaboration Project and has customized this data discovery tool in part with Federal funds from the National Library of Medicine, National Institutes of Health, Department of Health and Human Services, under cooperative agreement number UG4LM012342 with the University of Pittsburgh, Health Sciences Library System. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
~Helenmary Sheridan