Texas Tech University

What is Data Management?

Data management is a growing area of focus for faculty as they prepare grants and funding proposals. Many primary funding agencies, including NSF, NIH, NEH, NIJ, NASA, and IMLS, now require that researchers include a data management plan in their proposals. A well-prepared funding proposal therefore depends on understanding a few core ideas of data management.

Data management is the process of managing the full lifecycle of information and data. It covers four areas: discovery, preservation, access, and security [1].

Figure 1: Life Cycle of Data

Discovery ensures that generated data can be accurately described, identified, and located. To support discovery, curators apply descriptive metadata and administrative metadata to the generated data. Descriptive metadata, such as the data title, a description of the research background, and the date of generation, gives users the introductory information they need to use and reuse the data. Administrative metadata includes persistent identifiers and the data naming structure, which not only facilitate discovery but also help with internal data administration. Together, these let researchers run keyword searches to locate the data they need, and let curators retrieve and manage it.
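As a rough illustration of how the two kinds of metadata might sit together in one record, here is a minimal Python sketch. The class name, field names, identifiers, and file names are all hypothetical and chosen only for this example; a real repository will define its own metadata schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class DatasetRecord:
    # Descriptive metadata: helps others understand and reuse the data.
    title: str
    description: str
    date_generated: date
    # Administrative metadata: supports identification and internal management.
    persistent_id: str  # hypothetical identifier; a real one might be a DOI or handle
    file_name: str      # follows whatever naming structure the curator adopts
    keywords: List[str] = field(default_factory=list)


def keyword_search(records, term):
    """Return records whose title, description, or keywords mention the term."""
    needle = term.lower()
    return [
        r for r in records
        if needle in r.title.lower()
        or needle in r.description.lower()
        or any(needle in k.lower() for k in r.keywords)
    ]


# Made-up records, for illustration only.
records = [
    DatasetRecord("Soil moisture survey", "Field readings from test plots",
                  date(2015, 3, 1), "example-id-0001",
                  "soil_moisture_2015_v1.csv", ["soil", "agriculture"]),
    DatasetRecord("Recital recordings", "Audio of piano recitals",
                  date(2014, 11, 20), "example-id-0002",
                  "recitals_2014_v1.wav", ["music", "audio"]),
]

for match in keyword_search(records, "soil"):
    print(match.persistent_id, match.title)
```

The descriptive fields are what a researcher searches against, while the persistent identifier and file name are what a curator would rely on to retrieve and manage the matching item.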

Preservation covers the technical aspects of long-term data management and ensures that data of continuing value remains accessible and usable. Preserving research data involves format conversion, system or media migration, and data recovery processes. To picture this, think about how audio media have moved from reels and phonograph records to cassettes, video cassettes, CDs, and DVDs, and now to mp3 or wav files on computers and smartphones. Ensuring that a recorded Beethoven recital is still playable far into the future is what data preservation means.

Access is about controlling who can view and/or download which data. When depositing data into a repository system, data curators need to customize the access restrictions based on the requirements of the data copyright holder. Access options include open access, an embargo for a specific time frame, authorized access for named individuals or groups, and total restriction.
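The access options described above can be pictured as a simple decision rule. The following Python sketch is only an illustration: the `AccessLevel` names and the `can_access` function are hypothetical, and it assumes that an embargoed dataset becomes open once its embargo date arrives. Actual repository systems implement access control in their own ways.

```python
from datetime import date
from enum import Enum


class AccessLevel(Enum):
    OPEN = "open"              # anyone may view or download
    EMBARGOED = "embargoed"    # closed until a release date
    AUTHORIZED = "authorized"  # only named individuals or group members
    RESTRICTED = "restricted"  # no access


def can_access(level, user, today, embargo_until=None, authorized_users=frozenset()):
    """Decide whether `user` may view or download a dataset on `today`."""
    if level is AccessLevel.OPEN:
        return True
    if level is AccessLevel.EMBARGOED:
        # Assumption: once the embargo date is reached, the data is open to all.
        return embargo_until is not None and today >= embargo_until
    if level is AccessLevel.AUTHORIZED:
        return user in authorized_users
    return False  # RESTRICTED


# Example: an embargo that lifts on January 1, 2016.
print(can_access(AccessLevel.EMBARGOED, "guest", date(2015, 5, 11),
                 embargo_until=date(2016, 1, 1)))
print(can_access(AccessLevel.EMBARGOED, "guest", date(2016, 1, 1),
                 embargo_until=date(2016, 1, 1)))
```

Keeping the rule in one place makes it easier for a curator to check that the settings match what the data copyright holder asked for.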

Security involves two main aspects: the security of the data itself and the privacy of research subjects. Data security usually follows the applicable access and retention requirements. Data curators need to ensure that data is safe from unauthorized changes or access and from data corruption, and that it stays secure during back-ups and recovery. Data curators also need to make sure all data is de-identified or anonymized before it is deposited in a repository. Identifying information can be removed from datasets entirely, coded, or encrypted. IRB approval for the research data is always required before deposit.
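To make two of these ideas concrete, the Python sketch below shows one possible way to remove or code identifying fields and one common way to detect unauthorized changes or corruption by comparing checksums. The function names, field names, and key are hypothetical, and the coding approach (a keyed hash from Python's standard `hmac` module) is an assumption made for this example; which method is acceptable for a given study depends on its IRB requirements.

```python
import hashlib
import hmac


def deidentify(record, drop_fields, code_fields, secret_key):
    """Return a copy of `record` with identifying fields removed or coded."""
    cleaned = {}
    for name, value in record.items():
        if name in drop_fields:
            continue  # identifying information removed entirely
        if name in code_fields:
            # Coded: replaced by a keyed hash so the same subject always
            # gets the same code, without exposing the original value.
            digest = hmac.new(secret_key, str(value).encode("utf-8"),
                              hashlib.sha256).hexdigest()
            cleaned[name] = digest[:16]
        else:
            cleaned[name] = value
    return cleaned


def checksum(data):
    """SHA-256 checksum of a bytes object, used to detect changes."""
    return hashlib.sha256(data).hexdigest()


# Made-up record and key, for illustration only.
record = {"name": "Jane Doe", "participant_id": "P-001", "score": 42}
safe = deidentify(record, drop_fields={"name"},
                  code_fields={"participant_id"}, secret_key=b"demo-key")
print(sorted(safe))  # the "name" field is gone

# Record the checksum at deposit, then compare after a back-up or recovery.
original = b"participant_id,score\n"
stored = checksum(original)
print(checksum(original) == stored)
```

If the checksum computed after a back-up or migration differs from the one recorded at deposit, the file has been altered or corrupted and should be restored from a known good copy.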

This article gives an outline of what data management is. In the coming weeks TTU Libraries' Data Management Team will be presenting a series of articles in the Scholarly Messenger to help highlight important issues to consider when preparing data management plans.

For more information or questions: libraries.datamanagement@ttu.edu

1. DeeAnn Allison, 2015, "Managing the Life-cycle of Data." Retrieved May 11, 2015 from http://digitalcommons.unl.edu/libraryscience/325/