This information is over 2 years old.
Information was current at time of publication.

Forecasting Data Costs for Biomedical Data Preservation

A data management plan is a formal document outlining how you will handle your data both during your research and after the project is completed. While writing this plan, and especially while preparing your grant application, think through the long-term costs of managing and preserving data throughout its life cycle, along with the resources (both physical and personnel) needed to do so.

A new consensus study report from the National Academies of Sciences, Engineering, and Medicine titled “Life-Cycle Decisions for Biomedical Data: The Challenge of Forecasting Costs” may be useful to researchers trying to accomplish this task. The report provides a framework to “help researchers identify and think through the major decisions in forecasting life-cycle costs for preserving, archiving, and promoting access to biomedical data.”

In addition to the report, many other valuable tools and guides are linked under the “resources” tab on the National Academies Press page. Of particular interest are:

Continue reading

NCBI Datasets: Making Genomic Data Download Easy

[Image: organisms with datasets: Homo sapiens (human), Mus musculus (house mouse), Arabidopsis thaliana (thale cress), and Rattus norvegicus (Norway rat)]

Downloading genomic data can be challenging. File sizes are large, and retrieving multiple files can be time consuming. Sometimes downloads fail, and a custom script may be required. Fortunately, a solution to all of these frustrations is now available: NCBI Datasets.

This experimental resource allows users to easily download eukaryotic genome sequence and annotation data by assembly accession, taxonomic name (scientific and common), or taxonomy ID. The web interface allows for browsing by organism, with the most common experimental species conveniently available from the main page. For example, try selecting the house mouse (Mus musculus), then select all 22 associated assemblies. Options for the type of data for the download include genomic, transcript, and protein sequences as well as annotation features. Continue reading

Build a Better Research Process with HSLS Data Services

Across the diverse fields served by the Health Sciences Library System, one thing is universal: good science depends on good data. Whether you are embarking on your first research project or have dozens of completed studies under your belt, the HSLS Data Services team is here to help you improve the efficiency and reliability of your data-handling workflows at every step in the research process. We offer consultations, classes, and customized trainings on data topics including:

  • Organizing and describing files and data—always an important practice, but especially critical at a time when many researchers are working in multiple locations, on distributed teams, or on multiple computers and file servers. These workshops are also recommended for new graduate students to set themselves up with good habits from the beginning.
  • Writing a data management plan for funders and publishers, including pre- and post-submission review using DMPTool.

Continue reading

Share with Flair with FAIR-Aware

Whether you’re new to the conversation about open science or a longtime supporter of sharing and reusing research data, the FAIR guidelines for making data findable, accessible, interoperable, and reusable establish a basic set of principles for all practitioners who wish to make their research more reproducible. The “how” of doing so varies greatly among fields, modes of research, and investigators’ goals, however, so figuring out the first actions to take to make your research products more FAIR can pose a challenge. FAIR-Aware, a new online tool from the FAIRsFAIR project, aims to help researchers think through each FAIR principle and demystify related jargon. It is a self-guided questionnaire with extensive explanatory guidance on the concrete steps involved in making data FAIR. Continue reading

COVID-19: HSLS Portals for Data and Molecular Biology Resources

HSLS Data Services and the Molecular Biology Information Service created online portals to help researchers quickly find the information they need to address questions about SARS-CoV-2 and COVID-19.

[Image: spikes in a corona formation on the outside of the virus]

The Data Management: COVID-19 Research Data guide includes lists of general and clinical repositories. These linked resources are COVID-19-specific portals for sharing, discovering, reusing, and citing COVID-19 data and code.

The HSLS MolBio COVID-19: resources guide includes categories of linked resources: Trending Research Articles, Research Article Collections, Information Hubs, Molecular Data, and Webinars & Videos. Continue reading

Open Access COVID Datasets and Software

“Sharing vital information across scientific and medical communities is key to accelerating our ability to respond to the coronavirus pandemic,” said Dr. Cori Bargmann, Head of Science at the Chan Zuckerberg Initiative, regarding a call to action to develop new text and data mining techniques that can help the science community answer high-priority scientific questions related to COVID-19.

Over the past few weeks, two notable resources have been made available, providing open access to COVID datasets and related software:

COVID-19 Open Research Dataset (CORD-19) Continue reading

Data Journals: Standalone Publication for Replication, Negative, Intermediate, or Simply Noteworthy Data

As the scholarly community continues to recognize the importance of open data sharing for increasing the reproducibility of research, researchers are faced with a growing menu of options through which to make their data available. For example, is it better to deposit data in a digital repository, which often grants depositors a Digital Object Identifier (DOI), or to formally describe a dataset in a data journal article, or to share it through a metadata registry like the Pitt Data Catalog? A recent video call for papers from the journal Data in Brief argues that data journals offer a unique opportunity for standalone publication of genres that are often critically underserved by the scholarly publishing ecosystem: datasets containing replication data, negative results, and intermediate data for research in progress. Continue reading

Better Data Sharing in Six Simple Steps

I recently attended a workshop from the Data Curation Network, a collaboration of institutions that have developed specific guidelines to help their researchers share research data. Though the workshop was aimed at librarians, the DCN’s process is useful to any researcher preparing data for sharing in a repository. If you are interested in making your research more reproducible, I encourage you to consider these simple steps.

Imagine that you have a dataset—a package of data files, documentation such as codebooks or READMEs, and perhaps analysis code—that you wish to (or are required to) deposit in a repository such as Figshare or OpenNeuro. The files you have probably require some cleanup before you share them with the world, but there may be other actions you can take that would have a big usability payoff for minimal investment. The steps below form the Data Curation Network’s “CURATE” model, paraphrased here but available in full online: Data Curation Network: A Cross-Institutional Staffing Model for Curating Research Data. Continue reading

Feedback Request: Draft NIH Policy for Data Management and Sharing

In a follow-up to last year’s request for input on updates to its 2003 Data Sharing Policy, the National Institutes of Health (NIH) is soliciting public feedback on a draft policy for data management and sharing activities related to public access and open science. Regarding the necessity of such a policy, the NIH states:

“Validation and progress in biomedical research—the cornerstone of developing new prevention strategies, treatments, and cures—is dependent on access to scientific data. Sharing scientific data helps validate research results, enables researchers to combine data types to strengthen analyses, facilitates reuse of hard to generate data or data from limited sources, and accelerates ideas for future research inquiries. Central to sharing scientific data is the recognized need to make data as available as possible while ensuring that the privacy and autonomy of research participants are respected, and that confidential/proprietary data are appropriately protected.”

The draft policy would apply to all NIH-funded or conducted research resulting in the generation of scientific data and would require: Continue reading

New Data Repository Option for NIH Researchers: NIH Figshare

In July 2019, NIH and Figshare announced the one-year pilot launch of a general data repository for all NIH-funded researchers: NIH Figshare. This repository makes datasets resulting from NIH-funded research accessible by providing a way for NIH researchers to meet data sharing requirements of grants, journals, or institutions when a subject-specific repository is not an option. Continue reading

24/7 Training for Data Analysis and Statistics Software: E-Resources from the Library

If you write scripts or use data analysis software, did you know that the Health Sciences Library System provides access to thousands of reference materials to help support research programming in the health sciences? If you want to test out software or need help interpreting a never-before-seen error message, the library’s streaming videos and e-books are available to anyone with a Pitt ID, on- or off-campus.

LinkedIn Learning (formerly known as Lynda.com) provides video tutorials, transcripts, and exercises for popular data analysis and statistics software. Need an introduction to SPSS? Try the SPSS Statistics Essential Training course to learn the basics, or focus on quantitative tests in the SPSS for Academic Research course. Dive deep into SAS with a multi-part series: SAS Essential Training: Descriptive Analysis for Healthcare Research and SAS Essential Training: Regression Analysis for Healthcare Research. Introductions to Stata and MATLAB are also available. Continue reading

Tell Us Your Story: Outcomes from Data Sharing

During Love Data Week, HSLS Data Services gathered stories from health sciences researchers to better understand the “benefits or unforeseen outcomes” experienced from data sharing.

The paraphrased stories below illustrate the importance of data security and thoughtful data management.

There is an expectation that one’s identity will remain 100% confidential when participating in a research study. A breach in data security, identified during a Google search, made one research participant hesitant about sharing any personal data in future studies. Continue reading
