The Utility of Data Management Plans and Other Written Protocols
By Marianne Evola
Years ago, when I was finishing my dissertation, I set up a meeting with a couple of members of my dissertation committee to discuss some difficulties I was having with data and statistics. I brought my data to the meeting, along with a few published manuscripts to reference standard practices for managing and analyzing data. I presented my plan for the data and then started to discuss my confusion regarding the analyses described in the published literature. I was hoping that my committee could clarify how our colleagues were organizing and analyzing their data. As I voiced my confusion over the published analyses, one of my committee members waved her hand and declared that readers seldom understand how data was actually handled because the methodology for data analyses was always vague and poorly written. Everyone laughed, including me, but that declaration has stuck with me ever since. It is interesting that researchers can write so clearly about experimental methodology yet often fail to be clear about handling data. I have always been able to break down the experimental methodology in a published manuscript into a simple flow chart. However, it is much more difficult to do the same for data management and analyses. It was at that meeting that I realized that I, as the struggling student, was not the weak link in comprehending strategies for data management and analyses. Rather, the problem was often the published literature.
More recently, weaknesses in writing about data management and analyses, as well as weaknesses in actual data management in research labs, have been addressed. Anyone who has submitted a grant proposal to the National Science Foundation (NSF) since 2011 is aware that NSF requires a Data Management Plan to be included with research proposals. A similar requirement is in place for many National Institutes of Health (NIH) awards, which require applicants to submit a Data Sharing Plan with their proposal. The goals of these data management plans are to ensure that a research team has a system for managing and organizing data, and to create a system for archiving data so that it is not lost when personnel change or a grant comes to an end. In addition, data plans mark a step toward a system that readily supports data sharing and/or data repositories. Archiving data produced through funded research makes that data available for data mining by appropriate outside parties that have an interest in asking additional questions of an existing dataset. As such, data management plans are expected to include a description of the data that will be collected, the format and organization of the data, resources for preserving and archiving data, methods that will be used to make data available to others, policies for sharing data that address issues of confidentiality and intellectual property, and finally, a list of parties that are responsible for managing data during data collection and after completion of the research project.
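For labs that keep their plans alongside their electronic records, the elements listed above can be sketched as a simple checklist. The Python below is only an illustration: the class name, field names, and example entries are hypothetical and are not an official NSF or NIH schema. It shows one way a team might spot sections of a draft plan that have not yet been written.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class DataManagementPlan:
    """Hypothetical checklist of plan sections (illustrative names, not an agency schema)."""
    data_description: str = ""          # what data will be collected
    format_and_organization: str = ""   # file formats and how data is organized
    preservation_resources: str = ""    # resources for preserving and archiving data
    sharing_methods: str = ""           # how data will be made available to others
    sharing_policies: str = ""          # confidentiality and intellectual property
    responsible_parties: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Example entries are invented for illustration only.
draft = DataManagementPlan(
    data_description="Session logs and dosing records",
    format_and_organization="One CSV file per session, named by date and subject ID",
    responsible_parties=["Principal investigator", "Lab manager"],
)
print(draft.missing_sections())
```

A list like this could be brought to a lab meeting so that team members can volunteer to draft the sections that remain empty.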
As would be expected, many researchers are unhappy about the requirement of a data management plan, and equally unhappy that they may be expected to share their data. Regarding a data archive or repository, concerns include the potential for violations of intellectual property, especially in translational research that could lead to successful patents and business ventures. How long does the original researcher hold exclusive rights to the data for publishing and/or filing patents before being required to share data, and who has the right to release archived data? Considering that patents can take years and many labs have a backlog of publishable research, researchers need sufficient time to publish their work before outside parties are given access to archived work. How do you assign credit if outside parties choose to publish work created from archived data? Who is responsible for the costs of archiving data and reliably maintaining archived data as technology evolves, and will resources be wasted on archived data that will never be accessed? Also, how much time will be wasted on creating data management plans that may or may not be realized by the research team because of inefficient training of research personnel, personnel changes, and even changes in the research questions that naturally lead to changes in data management as a project evolves? These are all legitimate concerns.
However, the many concerns often overshadow the utility of defining a data management plan. A data management plan is an effective teaching tool for research teams that constantly address personnel changes or personnel with notably different data management experience. The data management plan should be provided to each new and existing member of the research team so that it can be utilized as a guide on how to handle data and provide an opportunity for personnel to ask for clarification and instruction. Often new members of a team, especially undergraduate assistants, do not have the experience to even formulate questions. The written management plan is an effective first step toward providing them with information and a vocabulary to address their confusion in what can be an intimidating situation. Furthermore, the research team should be included in constructing the data management plan. Researchers at different levels need instruction and feedback on their thoughts and writing, and often may approach a research project from different perspectives. Asking your team to construct a data management plan can provide creative insight into a research project as well as reduce some of the workload that goes into constructing a research proposal. It also gives senior members of your research team opportunities to develop critical management and organization skills if you give them the opportunity to compile the input provided by less experienced personnel.
During my graduate training, students in our lab had notably different projects. We were encouraged to pursue independent projects, master the literature, and create experimental protocols and data management systems that often differed markedly from our mentor’s historical research designs. As such, we were often asked to construct text for inclusion in animal protocols and grant proposals. It was excellent training, and it was also inspiring to watch more and more of our text survive the editing process as we grew into independent researchers. Similarly, I have had students tell me that their mentors have had them construct protocols for the Institutional Animal Care and Use Committee, the Institutional Review Board, and more recently, safety plans for their research environment. To be honest, students often grumble and complain about these tasks and sometimes feel intimidated by their assigned project. When students have raised their concerns, I have encouraged them not only to embrace the opportunity to design these systems and protocols, but also to enact systems that ensure accountability and compliance with a protocol once it is written. This type of management training is a critical part of research training. Furthermore, these are concrete accomplishments that can be included on a CV or resume when it comes time for students to market themselves.
Unfortunately, many labs never even think to share these protocols with their personnel, let alone include personnel in the construction or implementation of protocols. The protocols are in the lab, and personnel generally have access to them, but trainees are seldom encouraged to read or critique them. Furthermore, many institutions provide a “boilerplate” for the construction of management plans and protocols. Understandably, these templates are useful tools for faculty who juggle extraordinarily busy schedules. However, following a template or using text provided by the institution can limit insight and creativity and can reduce opportunities to teach and mentor lab trainees. Furthermore, neglecting to share the constructed protocol, or leaving the team out of protocol implementation, decreases the opportunities for trainees to develop marketable management skills that would also contribute to lab efficiency and protocol compliance.
Similarly, although I agree that many concerns about data archives or repositories are legitimate, I am also fascinated by a research future that provides access to archives of raw data. In some cases we have historical records of lab notebooks from our brilliant and highly successful research ancestors. Often, however, the raw data from our more contemporary colleagues has been lost or destroyed once the research has been published, the research focus has changed, or researchers have changed institutions or retired. It is understandable that boxes and boxes of paper and raw data could not effectively be stored for long periods with no perceivable use. I cannot help wondering how much information has been lost with discarded data. Arguably, in the past, digging through mounds of paper for snippets of data would have been laborious and inefficient, and thus would probably not have been very cost effective. Thus, warehousing cases and cases of paper documents would have been wasteful. I, myself, have been involved in moving and closing a lab, and have had no choice but to discard cabinets full of paper records. Luckily, I live in the modern era, and although I discarded paper, all the files are still held electronically. Similarly, most contemporary data is electronic, and as such, we are entering an era where data can effectively be archived and even efficiently queried to answer questions not conceived or deemed valuable by the original researchers. I find the opportunity to revisit and mine the data of colleagues intriguing and see the possibility of careers being built on the practice of data mining as repositories grow and evolve.
I’ve trained students who are happier and more talented when working on a large data set in front of a computer than when collecting data in the lab. The future of research may hold more opportunities for careers built on data mining and revisiting old data sets. This could enable researchers to work to their strengths to contribute to scientific discovery. In fact, I recently watched a report in the mainstream media that presented notable differences in how males and females respond to drugs. The report was not shocking to me because pharmacological research has been reporting notable sex differences in drug response for a decade or more. However, the news report also mentioned that a large number of studies had included both male and female subjects, yet had neglected to analyze the data for sex differences. Since I was thinking about data repositories for this article, it dawned on me that if that data has been effectively archived for sharing purposes, then someone could probably build a career on mining data to assess differences in the efficacy of existing drug treatments for males versus females. This is just one example of how archived data could be revisited to address new questions while saving the costs and labor associated with experimentation and data collection.
Funding agencies are now requiring data management plans. As such, investigators are creating them for proposal submission. Since they have already been created, I encourage researchers to use them to promote discourse and training with their research team. Many faculty members already use protocols and plans in this manner. If you have not thought to do so, I encourage spending a couple of lab meetings discussing what is contained in these documents. I also encourage students to embrace any opportunity to contribute to the construction or implementation of any type of protocol or management plan. If your mentor asks you to construct the data management plan or safety plan for your research lab, grab the opportunity to critically assess your research and environment for weaknesses and opportunities for improvement. Your mentor has an exceptional mind and valuable experience. However, your mentor also welcomed you into his/her lab to be a contributing member of the team. As a student, you are there to learn, but as a member of the team, you are expected to contribute. If you are new to a research environment and no one has yet encouraged you to read an animal or human subjects protocol, a safety plan, or a data management plan, ask if there are any protocols that you should review to ensure that you are well informed on lab protocol and procedure. I agree that these documents can be a lot of work to construct and that they place additional demands on extraordinarily busy people. However, once constructed, written protocols and plans can be valuable tools to engage your team or your mentor to maximize compliance and productivity. At the very least, since you have been required to write them, you can require others to read your work.
Marianne Evola is senior administrator in the Responsible Research area of the Office of the Vice President for Research. She is a monthly contributor to Scholarly Messenger.
Alice Young, associate vice president for research/research integrity, is a contributing author/editor.