Introducing the Pitt Data Catalog for Dataset Sharing and Discovery

Sharing research data can bring many benefits, including greater visibility for data creators, a more transparent research process, and opportunities to identify potential collaborators. But what about datasets that are stored on a lab server instead of in a data repository, or that should only be shared with vetted researchers? The Pitt Data Catalog is a new platform at HSLS designed to help Pitt health sciences researchers share and discover their otherwise hard-to-find datasets, while keeping ultimate control over the data in researchers’ hands.

“The Pitt data catalog has the potential to improve research collaborations and accelerate the impact of research being conducted in the schools of the health sciences. I strongly encourage each researcher to work with HSLS to make your datasets discoverable through the catalog in accordance with the FAIR Data Principles: Making Data Findable, Accessible, Interoperable and Reusable.” Dr. Arthur Levine, Senior Vice Chancellor for the Health Sciences

Unlike data repositories such as Dryad or Zenodo, the Pitt Data Catalog does not host any data files. Instead, each dataset included in the catalog is described in a metadata record that includes information about the dataset’s authors, subject domain, and data creation process, as well as instructions for accessing the dataset itself and links to associated publications. Some data catalog entries describe publicly available datasets, so their records link directly to the data in a repository. Other entries that describe privately held datasets may direct a visitor to e-mail the corresponding author, or link to a data-access application form. Each record is created in collaboration with the researcher to ensure accurate and comprehensive information.
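For readers who think in code, the idea of a record that describes a dataset without holding its files can be sketched roughly as follows. This is only an illustration: the class name, field names, and example URLs are assumptions made for this sketch, not the catalog's actual schema or software.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical catalog entry: describes a dataset, stores no data files."""
    title: str
    authors: List[str]
    subject_domain: str
    creation_process: str
    repository_url: Optional[str] = None    # publicly available data
    contact_email: Optional[str] = None     # privately held data, ask the author
    application_form: Optional[str] = None  # privately held data, apply for access
    publications: List[str] = field(default_factory=list)

    def how_to_access(self) -> str:
        """Return the access route a catalog visitor would follow."""
        if self.repository_url:
            return f"Download from {self.repository_url}"
        if self.application_form:
            return f"Apply for access at {self.application_form}"
        if self.contact_email:
            return f"E-mail {self.contact_email} to request access"
        return "Contact the catalog maintainers"


# Illustrative records only; titles, names, and addresses are made up.
public_record = DatasetRecord(
    title="Example public dataset",
    authors=["A. Researcher"],
    subject_domain="Genomics",
    creation_process="Sequencing run, 2017",
    repository_url="https://example.org/dataset/123",
)
private_record = DatasetRecord(
    title="Example restricted dataset",
    authors=["B. Researcher"],
    subject_domain="Clinical outcomes",
    creation_process="Chart review",
    contact_email="author@example.edu",
)

print(public_record.how_to_access())
print(private_record.how_to_access())
```

In this sketch the record carries only descriptive information and a pointer to the data, which mirrors how control over the files themselves stays with the researcher.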

If you have datasets you would like to have described in the Pitt Data Catalog, please contact the HSLS Data Services team at HSLSDATA@pitt.edu or through our dataset inclusion form. We’ll schedule an in-person or phone consultation to learn more about your datasets and discuss the most appropriate terminology to describe your data. After we create a draft of your dataset’s record, we’ll send it to you for final approval. If you have updates after the record is published, just contact us to make changes; we may also contact you to make sure our information is still current.

HSLS Data Services staff are happy to give demonstrations for individual health sciences researchers, departments, or labs. If you would like to investigate whether the Pitt Data Catalog would be a good match for your datasets, please reach out and we will gladly explore its possibilities with you.

The University of Pittsburgh, Health Sciences Library System, is a member of the Data Catalog Collaboration Project and has customized this data discovery tool in part with Federal funds from the National Library of Medicine, National Institutes of Health, Department of Health and Human Services, under cooperative agreement number UG4LM012342 with the University of Pittsburgh, Health Sciences Library System. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

~Helenmary Sheridan

Expand Your Data Analysis Universe with Galaxy

The life sciences are erupting with data. Thanks to advancements in DNA sequencing technologies and the speed and capacity of computational algorithms, the generation of vast quantities of genomic and proteomic data is now commonplace and expected. However, analysis of this data is not keeping pace with its acquisition (storage space is yet another issue…). One limiting factor is that many biomedical scientists do not yet know how to access, much less use, the available analytical resources. This article describes a platform for multi-omic data analysis that is accessible, reproducible, and transparent, and recommends resources on how to use it.

Galaxy is a community-supported platform that provides access to over 5,500 tools for a multitude of analytical needs, in categories such as variant analysis, imaging, and statistics. Its components include the Galaxy Software Framework and the Public Galaxy Service. The software framework is an open-source, web-based application that functions as an intermediary between researchers without informatics expertise and the computational infrastructure that runs and stores the analyses. The public service includes the main instance, which is an installation of the Galaxy software combined with many tools and data, as well as over 80 public servers. Some of these servers are even domain-specific (ImmPort Galaxy, focusing on flow cytometry analysis) or tool-publishing (MBAC Metabiome Server, simplifying the control, usage, access, and analysis of microbiome, metabolome, and immunome data). Local institutional instances are also possible; the University of Pittsburgh has a Galaxy server hosted by the Center for Research Computing.

The scale of Galaxy is initially a bit daunting. Fortunately, there are numerous resources to help researchers navigate the analytical possibilities. Everything to get you started is at galaxyproject.org, including Galaxy 101, dataset collections, interactive tours, and a growing collection of tutorials developed and maintained by the worldwide Galaxy community and Galaxy Training Network.

The HSLS Molecular Biology Information Service can also assist you with using Galaxy for your research. During the spring 2018 semester we are introducing two hands-on workshops that will teach the basics of Galaxy, including (1) interface navigation and interaction and (2) how to create, modify, and extract workflows.

To learn more, read the bioRxiv article on “Community-Driven Data Analysis Training for Biology” or contact the HSLS Molecular Biology Information Service.

~Carrie Iwema

Study Spaces at Falk Library

Need a place to study? HSLS provides a variety of study spaces for individuals and groups.

Individual Study

Study desks with privacy walls are available in two locations:

  • A newly renovated area on the main floor contains 44 study desks. Each station has power outlet access.
  • The upper floor has seating for 20 individuals across from Classroom 2.

Standing and stools:

  • Counters and standing desks are available on the main and upper floors, both providing easy access to power outlets.

Group Study

  • The recently opened study room on the main floor has 11 tables with 3-4 chairs each. Larger groups can push tables together.
  • Additional larger tables are interspersed with the computer stations on the main and upper floors.
  • The Study Lounge is located at the far end of the upper floor.

Comfortable Seating

Armchairs are available on the main floor near the leisure reading collection and on the upper floor in the Study Lounge.

 
Study Areas Reserved for the Schools of the Health Sciences

Group Study Rooms can be used for individual or group study for four hours at a time. Rooms must be reserved online. Keys to access the room are checked out at the Technology Help Desk with a valid Pitt ID.

The main floor room with individual study carrels will soon feature access for Health Sciences students via their Pitt ID.

~Julia Dahm

The Class That Wouldn’t Die

HSLS has a robust portfolio of classes available. We offer classes on core topics such as searching PubMed and using EndNote software, with more specialized classes added based on our expertise, changes in the information environment, and patron needs.

Most class additions are initiated by HSLS research and instruction librarians. But sometimes a new class is born from patron requests. As an example, here is a brief history of one class, “Searching for Dollars: Grant Seeking to Support Research.”

Thieme E-Book Library Name Change

The Thieme E-Book Library has been renamed and is now known as MedOne Education. HSLS provides direct access to this collection on the Databases A-Z list, and the access notes for the e-books will continue to show them as available via Thieme. Thieme e-journals are not affected, as their name remains unchanged. A free MedOne app lets users access subscribed content on both Android and iOS devices.

~Misti Kane

HSLS Staff News

The HSLS Staff News section includes recent HSLS presentations, publications, staff changes, staff promotions, degrees earned, etc.

Publication

Author name in bold is HSLS-affiliated

Jonathon Erlen, history of medicine librarian, along with co-author Megan Conway, published “Disability Studies: Disabilities Abstracts” in The Review of Disability Studies: An International Journal, 14(1), 2018.

Classes for April 2018

FlashClasses

Painless PubMed*, Friday, April 6, 8-9 a.m.

EndNote Basics, Tuesday, April 10, 10 a.m.-12 p.m.

Painless PubMed*, Monday, April 16, 12-1 p.m.

Molecular Biology Information Service

Can Learning Be Fun? (How-To Talks by Postdocs), Thursday, April 5, 12-1 p.m.

The Application of Mass Spectrometry in Biomedical Research (How-To Talks by Postdocs), Tuesday, April 10, 1-2 p.m.

ChIP-Seq & CLC Genomics, Wednesday, April 11, 1-4 p.m.

ChIP-Seq & Galaxy, Friday, April 13, 1-4 p.m.

Gene Regulation, Wednesday, April 18, 1-4 p.m.

Gemstones from the Mud: Technical overview of protein purification (How-To Talks by Postdocs), Thursday, April 19, 1-2 p.m.

RNA-Seq & CLC Genomics, Wednesday, April 25, 1-4 p.m.

How to Create a Quality Scientific Poster (How-To Talks by Postdocs), Thursday, April 26, 1-2 p.m.

RNA-Seq & Galaxy, Friday, April 27, 1-4 p.m.
