Guiding Principles for Data Management: Is Your Data FAIR?

For the past several years, researchers, funders, publishers, software developers, institutions, and other research stakeholders have been discussing methods for data-sharing and data stewardship on a grand scale, recognizing the need for minimal principles and practices. The FAIR data principles were first formalized in 2014 at a workshop in Leiden, The Netherlands, and are available for comment at the website of Force11.

“FAIR” is an acronym describing data that is (1) Findable, (2) Accessible, (3) Interoperable, and (4) Re-usable. Applying the four FAIR principles adds efficiency and value to research data that is ready for journal submission with its associated manuscript.

  1. Findable
    • Data should have a unique and persistent identifier at all times;
    • The unique and persistent identifier locates the dataset in a digital space;
    • Data should be distinguished from all other data via metadata;
    • Identifiers for any concept used in a dataset should also be unique and persistent.
  2. Accessible
    • Access can always be obtained by machines and humans with appropriate authorization;
    • Access can always be obtained by machines and humans through an open, free, well-defined protocol;
    • Machines and humans alike can access metadata, even if the data object itself is not available.
  3. Interoperable
    • If metadata is machine-readable, the data object is interoperable;
    • If metadata formats use shared vocabularies, the data object is interoperable.
  4. Re-usable
    • Data objects should be compliant with the first three principles to be re-usable;
    • Metadata should include a clear data usage license permitting reuse;
    • Documentation of software, code, and similar files must be included for accurate reuse;
    • Data objects must be clearly associated with their source (provenance) for proper citation.
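
As a rough illustration of how the checklist above could be applied to a dataset's metadata record, the following Python sketch reports which expected fields are missing under each principle. The field names (identifier, access_protocol, vocabulary, and so on), the function name, and the sample record are hypothetical choices made for this example; they are not defined by the FAIR Principles or by any particular repository, and passing this check would not by itself make a dataset FAIR.

```python
from typing import Any, Dict, List

# Hypothetical mapping from each FAIR principle to the metadata fields this
# sketch expects. The field names are illustrative assumptions only.
FAIR_FIELDS: Dict[str, List[str]] = {
    "Findable": ["identifier", "title"],
    "Accessible": ["access_protocol"],
    "Interoperable": ["metadata_format", "vocabulary"],
    "Re-usable": ["license", "documentation", "provenance"],
}


def fair_self_check(record: Dict[str, Any]) -> Dict[str, List[str]]:
    """Return, for each principle, the expected fields that are missing or empty."""
    gaps: Dict[str, List[str]] = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = []
        for field in fields:
            value = record.get(field)
            # Treat absent, None, and blank values as missing.
            if value is None or not str(value).strip():
                missing.append(field)
        if missing:
            gaps[principle] = missing
    return gaps


# A made-up metadata record used only to demonstrate the check.
example = {
    "identifier": "doi:10.1234/example",  # placeholder, not a real DOI
    "title": "Example dataset",
    "access_protocol": "https",
    "metadata_format": "JSON",
    "license": "CC-BY-4.0",
}

for principle, missing in fair_self_check(example).items():
    print(f"{principle}: missing {', '.join(missing)}")
```

For the sample record, the check flags the shared vocabulary under Interoperable and the documentation and provenance under Re-usable, which are the kinds of gaps a self-assessment is meant to surface before submission.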

With the FAIR Principles, there are now methods to evaluate both data and data repositories:

  • The FAIR Principles provide a method for self-assessment of basic dataset interoperability and usability.
  • The Data Seal of Approval is granted by an international organization to data repositories that meet quality standards via self-assessment.

For data-related questions, contact a member of the HSLS Data Management Group.

~Andrea M. Ketchum