February 10–14 is Love Data Week (#lovedata20). To celebrate, the HSLS Data Services team has 5 simple tips for showing your data love through data management best practices. Contact us with any data-related questions.
1. Organize your data considering the “80/20 rule.”
Typically, 20% of files are used 80% of the time. Think ahead about how to organize your files. The files you access most often should not be buried 10 clicks down in a folder structure. The following examples illustrate two folder structures for the same data.
2. Create and consistently use a file naming convention.
A file naming convention is a framework for naming your files in a way that describes what they contain and how they relate to one another. One best practice when creating a file naming convention is to write dates in the ISO 8601 format (shown below) so that they cannot be misread. For example, 1262011 could be interpreted three ways: June 12, 2011; December 6, 2011; or January 26, 2011.
YYYYMMDD or YYYY-MM-DD
(Attend the HSLS coffee break in March to learn more file naming best practices.)
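As a rough illustration of a naming convention in practice, the Python sketch below builds file names from a fixed set of parts. The function name `build_filename`, the order of the parts, and the example values are assumptions made for this illustration, not an HSLS standard; only the YYYYMMDD date format comes from the tip above.

```python
from datetime import date


def build_filename(project, description, created, version, extension):
    """Return a name such as 'project_description_YYYYMMDD_v01.ext'.

    The part order and separators are one possible convention (an assumption).
    """
    parts = [
        project,
        description,
        created.strftime("%Y%m%d"),  # ISO 8601 basic format, e.g. 20110126
        f"v{version:02d}",           # zero-padded so versions sort correctly
    ]
    # Replace spaces inside a part with hyphens, then join parts with underscores.
    stem = "_".join(part.strip().replace(" ", "-") for part in parts)
    return f"{stem}.{extension.lstrip('.')}"


# Hypothetical example values
print(build_filename("sleepstudy", "survey results", date(2011, 1, 26), 1, "csv"))
# prints sleepstudy_survey-results_20110126_v01.csv
```

Because the year comes first, files named this way also sort chronologically in an ordinary alphabetical file listing.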
3. Document, document, document!
Create a README file in the root folder of your dataset. README files are documents saved in plain text (.txt) or markdown (.md) format. Some recommended details to include are:
- contact information for the data creators
- date created
- licenses or restrictions placed on the data
- data dictionary
- description of methodology (with links or references to publications or other documentation containing experimental design or protocols used)
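One way to make README files consistent across datasets is to generate them from a template. The sketch below is a minimal example covering the details listed above; the template layout, the function names `build_readme` and `write_readme`, and every value in the usage example are placeholders invented for illustration.

```python
from pathlib import Path

# Layout is an assumption; adapt the headings to your own project.
README_TEMPLATE = """\
Dataset: {title}
Contact: {contact}
Date created: {created}
License or restrictions: {license_terms}

Data dictionary:
{dictionary}

Methodology:
{methodology}
"""


def build_readme(title, contact, created, license_terms, columns, methodology):
    """Fill in the template. `columns` maps each column name to its meaning."""
    dictionary = "\n".join(
        f"  {name}: {meaning}" for name, meaning in columns.items()
    )
    return README_TEMPLATE.format(
        title=title,
        contact=contact,
        created=created,
        license_terms=license_terms,
        dictionary=dictionary,
        methodology=methodology,
    )


def write_readme(root_folder, text):
    """Save the text as README.txt in the root folder of the dataset."""
    path = Path(root_folder) / "README.txt"
    path.write_text(text, encoding="utf-8")
    return path


# Hypothetical example values
text = build_readme(
    title="Sleep study survey",
    contact="Jane Doe <jane.doe@example.org>",
    created="2020-02-10",
    license_terms="CC BY 4.0",
    columns={"participant_id": "Anonymized ID", "age": "Age in years"},
    methodology="See the protocol referenced in the related publication.",
)
print(text)
```

Calling `write_readme("path/to/dataset", text)` would then save the file; the path shown is a placeholder.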
4. Save your files in a non-proprietary (open) format when possible.
The format in which your files are saved influences whether they can be opened in the future. Non-proprietary, or open, formats (e.g., .csv) are more interoperable and allow for long-term preservation and potential reuse. If your data cannot be saved in an open format, include the required software name, version, and parent company in your README file documentation.
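If your data already lives in a script or analysis environment, exporting it to .csv can take only a few lines. The sketch below uses Python's standard `csv` module; the helper names `save_as_csv` and `load_csv`, the column names, and the sample rows are assumptions for illustration.

```python
import csv
import tempfile
from pathlib import Path


def save_as_csv(rows, path):
    """Write a list of dictionaries to a UTF-8 CSV file with a header row."""
    fieldnames = list(rows[0].keys())
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def load_csv(path):
    """Read the file back; CSV stores no types, so every value is a string."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Hypothetical sample data, written to a temporary folder for the demo
rows = [
    {"participant_id": "P001", "age": 34},
    {"participant_id": "P002", "age": 29},
]
with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "survey.csv"
    save_as_csv(rows, target)
    restored = load_csv(target)

print(restored[0])
```

Because CSV does not record data types, numbers come back as text when the file is read. Recording each column's type and units in the data dictionary helps future users interpret the values.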
5. Back up your data using the “rule of 3.”
Backups can protect against accidental or malicious data loss. The “rule of 3” is usually suggested for backing up your data:
- Two onsite copies that are physically separate (i.e., NOT an external hard drive sitting on top of your desktop computer), AND
- One offsite copy (e.g., a remote server or cloud storage)
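A copy is only useful if it matches the original. As a minimal sketch, the Python code below copies a file into several backup folders and compares SHA-256 checksums after each copy. The function names `sha256_of` and `back_up` are hypothetical, and the code assumes each backup location, including the offsite one, is reachable as an ordinary folder (for example a mounted network drive or a cloud-synced folder).

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, reading it in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, destinations):
    """Copy `source` into each destination folder and verify every copy."""
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise OSError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies
```

A call such as `back_up("survey.csv", ["/mnt/lab_server/backups", "/home/me/CloudSync/backups"])` would create and verify two copies; both destination paths are placeholders.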