For the past several years, researchers, funders, publishers, software developers, institutions, and other research stakeholders have been discussing methods for data-sharing and data stewardship on a grand scale, recognizing the need for minimal principles and practices. The FAIR data principles were first formalized in 2014 at a workshop in Leiden, The Netherlands, and are available for comment at the website of Force11.
“FAIR” is an acronym describing data as (1) Findable, (2) Accessible, (3) Interoperable, and (4) Re-usable. Following the four FAIR principles adds efficiency and value to research data by the time it is ready for journal submission with its associated manuscript.
- Findable
  - Data should have a unique and persistent identifier at all times;
  - The unique and persistent identifier locates the dataset in a digital space;
  - Data should be distinguished from all other data via metadata;
  - Identifiers for any concept used in a dataset should also be unique and persistent.
- Accessible
  - Access can always be obtained by machines and humans with appropriate authorization;
  - Access can always be obtained by machines and humans through an open, free, well-defined protocol;
  - Machines and humans alike can access metadata, even if the data object itself is not available.
- Interoperable
  - If metadata is machine-readable, the data object is interoperable;
  - If metadata formats use shared vocabularies, the data object is interoperable.
- Re-usable
  - Data objects should be compliant with the first three principles to be re-usable;
  - Metadata should include a clear data usage license permitting reuse;
  - Documentation of software, code, and similar files must be included for accurate reuse;
  - Data objects must be clearly associated with their source (provenance) for proper citation.
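To make the list above concrete, here is a minimal sketch of what a machine-readable metadata record might look like. The field names, the dataset title, and the DOI are illustrative placeholders assumed for this example; they do not come from Force11 or from any formal metadata schema.

```python
import json

# Hypothetical metadata record. Field names are illustrative only,
# and the DOI below is a placeholder, not a real dataset.
record = {
    "identifier": "https://doi.org/10.1234/example-dataset",  # unique, persistent identifier
    "title": "Example survey responses",
    "keywords": ["survey", "public health"],   # helps distinguish this dataset from others
    "access_protocol": "https",                # open, well-defined protocol
    "license": "CC-BY-4.0",                    # clear usage license permitting reuse
    "provenance": {"creator": "Example Lab", "derived_from": None},
    "documentation": "README.txt",             # documentation needed for accurate reuse
}


def to_machine_readable(rec):
    """Serialize the record to JSON so that software can parse the metadata."""
    return json.dumps(rec, indent=2, sort_keys=True)


metadata_json = to_machine_readable(record)
print(metadata_json)
```

Because the metadata is stored separately from the data object, a record like this could stay available even when the dataset itself is restricted.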
With the FAIR Principles, there are now methods to evaluate both data and data repositories:
- The FAIR Principles provide a method for self-assessment of basic dataset interoperability and usability.
- The Data Seal of Approval is granted by an international organization to data repositories that meet quality standards via self-assessment.
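As a rough illustration of dataset self-assessment, the sketch below checks a metadata record for fields loosely tied to each principle. The checklist, the field names, and the example record are assumptions made for this sketch; this is not an official FAIR metric and not the Data Seal of Approval procedure.

```python
# Hypothetical checklist: each principle maps to metadata fields that
# should be present and non-empty. The mapping is an assumption for
# illustration, not an official FAIR metric.
FAIR_CHECKS = {
    "Findable": ["identifier", "title"],
    "Accessible": ["access_protocol"],
    "Interoperable": ["vocabulary"],
    "Re-usable": ["license", "provenance", "documentation"],
}


def assess(record):
    """Return, per principle, the required fields that are missing or empty."""
    return {
        principle: [field for field in fields if not record.get(field)]
        for principle, fields in FAIR_CHECKS.items()
    }


def passes_all(record):
    """True only when no principle has a missing field."""
    return all(not missing for missing in assess(record).values())


# Placeholder record (the DOI is not a real dataset).
example = {
    "identifier": "https://doi.org/10.1234/example-dataset",
    "title": "Example survey responses",
    "access_protocol": "https",
    "license": "CC-BY-4.0",
    "provenance": "Collected by Example Lab",
}

report = assess(example)
for principle, missing in report.items():
    status = "ok" if not missing else "missing: " + ", ".join(missing)
    print(f"{principle}: {status}")
```

A check like this only confirms that fields exist. Judging whether an identifier is truly persistent or a license truly permits reuse still requires human review.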
For data-related questions, contact a member of the HSLS Data Management Group.
~Andrea M. Ketchum