5 Simple Ways to Love Your Data

Love Data Week. Show your data some love. Let us show you how.

February 10–14 is Love Data Week (#lovedata20). To celebrate, the HSLS Data Services team has 5 simple tips for showing your data love through data management best practices. Contact us with any data-related questions.

1. Organize your data considering the “80/20 rule.”

Typically, 20% of files are used 80% of the time. Think ahead about how to organize your files: the ones you access most often should not be buried 10 clicks deep in a folder structure. The following examples illustrate two folder structures for the same data.

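Example 1 (grouped by animal, then by data type):

  • Heart_Regeneration
    • Rat1
      • Camera_Images
      • Pressure
      • Slides
        • Epifluorescent
        • Confocal

Example 2 (grouped by data type, then by animal):

  • Heart_Regeneration
    • Camera_Images
      • Rat1
    • Pressure
      • Rat1
    • Slides
      • Rat1
        • Epifluorescent
        • Confocal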
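
If you set up the same folders for every animal, a short script can keep the structure consistent. The Python sketch below builds the first layout shown above (animal first, then data type). The function name create_animal_folders and the SUBFOLDERS list are hypothetical names chosen for illustration; adjust them to match your own project.

```python
from pathlib import Path

# Hypothetical layout mirroring the first example above:
# one folder per animal, with the data types underneath it.
SUBFOLDERS = [
    "Camera_Images",
    "Pressure",
    "Slides/Epifluorescent",
    "Slides/Confocal",
]


def create_animal_folders(root, animal):
    """Create the data-type folders for one animal and return their paths."""
    created = []
    for sub in SUBFOLDERS:
        folder = Path(root) / "Heart_Regeneration" / animal / sub
        # parents=True builds missing parent folders;
        # exist_ok=True makes the script safe to run more than once.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling create_animal_folders("D:/Research", "Rat2") would add a matching set of folders for a second animal; the drive path here is only a placeholder.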

2. Create and consistently use a file naming convention.

A file naming convention is a framework for naming your files in a way that describes what they contain and how they relate to one another. One best practice is to write dates in the ISO 8601 format (shown below) so that they are unambiguous. For example, 1262011 could be interpreted three ways: June 12, 2011; December 6, 2011; or January 26, 2011.

YYYYMMDD or YYYY-MM-DD

(Attend the HSLS coffee break in March to learn more file naming best practices.)
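
As a rough illustration of a naming convention in practice, the Python sketch below builds file names that end in a YYYYMMDD date. The project_description_date pattern and the function name build_filename are assumptions made for this example, not an HSLS standard; pick whatever elements suit your data and apply them consistently.

```python
from datetime import date


def build_filename(project, description, when, extension):
    """Return a name such as 'HeartRegen_Rat1-pressure_20110126.csv'."""
    stamp = when.strftime("%Y%m%d")  # ISO 8601 basic format: YYYYMMDD
    return f"{project}_{description}_{stamp}.{extension}"


print(build_filename("HeartRegen", "Rat1-pressure", date(2011, 1, 26), "csv"))
# prints HeartRegen_Rat1-pressure_20110126.csv

# The extended form with hyphens is available directly:
print(date(2011, 1, 26).isoformat())
# prints 2011-01-26
```

A side benefit of putting the year first is that files sorted by name also fall into chronological order.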

3. Document, document, document!

Create a README file in the root folder of your dataset. README files are documents saved in plain text (.txt) or Markdown (.md) format. Some recommended details to include are:

    • contact information for the data creators
    • date created
    • licenses or restrictions placed on the data
    • data dictionary
    • description of methodology (with links or references to publications or other documentation containing experimental design or protocols used)
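
One way to make documentation a habit is to generate a README skeleton whenever you start a dataset. The Python sketch below writes a plain-text README.txt containing the details listed above. The template wording, the function name write_readme, and its parameters are all hypothetical; treat it as a starting point rather than a required format.

```python
from datetime import date
from pathlib import Path

# Hypothetical template covering the recommended README details.
README_TEMPLATE = """\
Dataset: {title}
Contact: {contact}
Date created: {created}
License or restrictions: {license}

Data dictionary:
{dictionary}

Methodology:
{methodology}
"""


def write_readme(dataset_root, title, contact, license_text,
                 columns, methodology, created=None):
    """Write README.txt in the dataset's root folder and return its path."""
    created = created or date.today()
    # One line per variable: "  name: what it means"
    dictionary = "\n".join(
        f"  {name}: {meaning}" for name, meaning in columns.items()
    )
    text = README_TEMPLATE.format(
        title=title,
        contact=contact,
        created=created.isoformat(),
        license=license_text,
        dictionary=dictionary,
        methodology=methodology,
    )
    path = Path(dataset_root) / "README.txt"
    path.write_text(text, encoding="utf-8")
    return path
```

The columns argument is a dictionary that maps each variable name to a short description, which becomes the data dictionary section.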

4. Save your files in a non-proprietary (open) format when possible.

The format in which your files are saved affects whether they can be opened in the future. Non-proprietary, or open, formats (e.g., .csv) are more interoperable and allow for long-term preservation and potential reuse. If your data cannot be saved in an open format, include the required software name, version, and parent company in your README file documentation.
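
For tabular data, saving to .csv can be done with Python's standard library alone. The sketch below is a minimal example; the function name save_as_csv and the sample column names are hypothetical.

```python
import csv


def save_as_csv(path, header, rows):
    """Write a header row and data rows to a UTF-8 encoded .csv file."""
    # newline="" lets the csv module control line endings itself,
    # which avoids blank lines between rows on Windows.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
```

For example, save_as_csv("HeartRegen_Rat1-pressure_20110126.csv", ["animal", "pressure_mmHg"], [["Rat1", 92.5]]) produces a file that any spreadsheet program or text editor can open. The values shown are made up for illustration.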

5. Back up your data using the “rule of 3.”

Back-ups can protect against accidental or malicious data loss. The “rule of 3” is usually suggested for backing up your data:

    • Two onsite copies that are physically separate (e.g., NOT an external hard drive sitting on top of your desktop computer), AND
    • One offsite copy (e.g., on a remote server or in the cloud)
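
The copying step can be scripted so that backups happen the same way every time. The Python sketch below copies one file to several backup folders, for instance a second onsite drive and a mounted offsite location, which together with the original make three copies. It also compares SHA-256 checksums after each copy; that verification is an extra safeguard added for this example and is not part of the "rule of 3" itself. The function names and destination folders are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, destinations):
    """Copy source into each destination folder and verify every copy."""
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)  # copy2 also keeps file timestamps
        if sha256_of(target) != expected:
            raise OSError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies
```

A script like this only helps if the destinations really are on separate devices; two folders on the same hard drive do not count as two copies.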

~Melissa Ratajeski