Featured Workshop: Exploring and Cleaning Data with OpenRefine

HSLS offers classes in a wide array of subjects—molecular biology, database searching, bibliographic management, and more! You can quickly view all Upcoming Classes and Events or sign up to receive the weekly Upcoming HSLS Classes and Workshops email.

This month’s featured workshop is Exploring and Cleaning Data with OpenRefine. The workshop will take place on Friday, June 11, 2021, from 10 to 11:30 a.m.

Register for this virtual workshop*

Exploring and Cleaning Data with OpenRefine is a workshop that introduces participants to the basics of working with OpenRefine to clean, organize, and transform messy datasets.

OpenRefine (formerly Google Refine) is a powerful, free, open-source tool for working with unorganized tabular data. Because OpenRefine runs locally on your computer and uses your web browser only as its interface, your private data is not uploaded to the cloud and stays on your local computer. Note that you are always working on a copy of your data; your raw data files are kept in their original form. Another benefit of OpenRefine is that, although the program has a graphical interface, it records each completed step, which makes data cleaning reproducible. These steps can be saved as JSON scripts and reapplied to clean other, similar files.

Attendees of this workshop will learn how to use OpenRefine to create a new project; explore the data through sorting, filtering, and faceting functions; complete basic data cleaning such as splitting or combining cells and clustering to find and fix inconsistent data entries; and create JSON scripts. This workshop is perfect for researchers and anyone else who works with tabular datasets. Participants are encouraged to review the system requirements, download and install the latest version of the software before class, and follow along with the class examples. Not familiar with OpenRefine? No previous experience is required to attend this hands-on workshop.
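For readers curious about what "clustering to find and fix inconsistent data entries" means in practice, here is a minimal Python sketch of the general key-collision idea: each value is reduced to a normalized key, and values that share a key are grouped as likely variants of the same entry. This is an illustration only, not OpenRefine's own code; the function names `fingerprint` and `cluster` and the sample entries are made up for this example.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key so that near-duplicate spellings collide.

    Hypothetical helper for illustration; not OpenRefine's implementation.
    """
    # Fold accented characters to their closest ASCII form.
    ascii_text = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Trim, lowercase, and delete punctuation characters.
    cleaned = ascii_text.strip().lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    # Split on whitespace, drop duplicate tokens, and sort them.
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values by fingerprint; keep only groups with 2+ distinct spellings."""
    groups = defaultdict(set)
    for v in values:
        groups[fingerprint(v)].add(v)
    return {key: sorted(vs) for key, vs in groups.items() if len(vs) > 1}


# Made-up sample data: three spellings of one institution, plus one other value.
entries = [
    "University of Pittsburgh",
    "university of pittsburgh ",
    "Pittsburgh, University of",
    "UPMC",
]
clusters = cluster(entries)
for key, variants in clusters.items():
    print(key, "->", variants)
```

In this sketch the three spellings of the university fall into one group because they reduce to the same key, while "UPMC" stands alone and is left out. In OpenRefine itself, clustering is done through the graphical interface, where you review each suggested group and choose the value to keep.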

Interested in learning more about data management? HSLS Data Services has a wide array of classes, workshops, and customized trainings. If you’re unable to attend at the scheduled times, request a customized session of any data management class or other HSLS workshop for your course, group, or department.

Contact a librarian from HSLS Data Services to find out more.

* HSLS classes are open to University of Pittsburgh faculty, staff, and students, as well as UPMC residents and fellows. A valid email address is required to register.

~Marissa Spade