Texas Tech University

"The Dog Ate My Data" and Other Excuses Commonly Given for Alleged Misconduct

By Marianne Evola

My last responsible research contribution to the Scholarly Messenger addressed common excuses that were given for plagiarism. That list originated with the National Science Foundation's Inspector General, Allison Lerner, when she gave a presentation at the Quest for Research Excellence Conference. At about the same time that Lerner's list was made public, an editorial in the journal Oncogene addressed a list of common excuses given by researchers when the journal's editors were required to confront them regarding research misconduct allegations associated with published research. I thought that it would be beneficial to extend the discussion of common excuses provided for misconduct and discuss why these excuses seldom excuse bad research and publication choices.

1) "Nothing to see here. Move along."

As the editorial authors, Stebbing and Sanders, describe, these respondents simply deny any wrongdoing, even when faced with definitive evidence of troubling data and/or figures in their publication. This type of blind denial is not only insulting to anyone who detects problems with their data, it is contrary to the data-driven nature of our entire research culture. We were taught, and we teach our students, that "data are the data" and/or the truth is in the data. Thus, if the data suggest that there are problems with the publication, then it is likely that there are problems with the publication. When authors deny that there is any problem with a publication in the face of overwhelming evidence, it is tantamount to telling their audience that the data are irrelevant.

2) "My dog ate the data."

When I was a first-year graduate teaching assistant (GTA), I taught one of five lab sections under a supervising lecture professor, as many graduate students do. As part of the course projects, each of the lab sections would conduct experiments and then the GTAs would compile the data from all of the sections into a single dataset and analyze the results for the class. I happened to be the GTA who was responsible for compiling the first dataset. As such, the undergrads conducted their experiment and I collected the datasheets from the other GTAs. That evening, I entered the data into a spreadsheet and analyzed/graphed the results. The next day, we met with the professor to go over the results. Almost immediately, he detected an issue. I don't remember what the problem was, but he asked to see the datasheet from a particular student. I looked at him in a panic and he immediately recognized the panic on my face. He grumbled, "You threw away the datasheets after you entered the data." I nodded. He then looked at each of his GTAs, especially me, and clearly stated, "You NEVER throw away raw data, ever!" That was the first and only lesson that I ever needed on data retention.

As Stebbing and Sanders state, there are many legitimate reasons that authors have not retained original research data when faced with an allegation of misconduct. Historically, it was very costly to store cases and cases of paper documents containing raw data. It was even more costly to move all of those cases of raw data across the country, if/when an investigator moved to a new institution. So, most researchers eventually had to part with their raw data.

However, as technology evolves, some of those legitimate reasons for discarding data are disappearing. As data storage is increasingly electronic, and as data storage and archiving requirements evolve, it is easier and less expensive for investigators to retain experimental data indefinitely. Increasingly, research sponsors are requiring that proposed investigators have a plan to archive data for long-term storage and accessible sharing with colleagues. That being said, although much of the technology is new, data retention is not a new expectation of the research world. After all, I was taught to do so as a new graduate student. And no, I will not disclose exactly how many years have passed since that lesson. Suffice it to say, it has been more than a few.

3) "If you look hard enough, you can find a trivial difference between two supposedly duplicated images."

Stebbing and Sanders make the strong argument that it is not a matter of rationalizing whether two images have slight differences. Rather, it is more important to assess the probability of obtaining two images from distinct experiments that could be so similar. I agree. Actually, if you consider what is currently termed the reproducibility crisis that concerns so much of our research community (https://www.scientificamerican.com/video/is-there-a-reproducibility-crisis-in-science/), it is surprising that someone would make the argument that the reader is overlooking slight differences because they are not effectively scrutinizing the data.

I conducted experiments in animal models of drug abuse and I can tell you a lot about the suffering that comes with variability in data, even for well-established drug effects. Establishing consistent behavioral control, before conducting fun original experiments, is a standard in my discipline. We spend a lot of time and effort on creating reproducible, reliable baselines. Yet, even with a strong focus on establishing reliable controls, data variability is the norm rather than the exception. And, frustratingly, there is always at least one inexplicable data point throwing off your analyses. Variability is inherent in our discipline because we work with animals, and animals do inexplicable things. And as for my colleagues who work with the human animal, their research participants are even more unpredictable than my four-legged subjects. Because of this, when researchers in my discipline see flawless data with little variability, they question how the data could be that clean.

After years of talking to students and faculty about responsible research and data variability, I am certain of one thing: the battle against variability is not restricted to my discipline but is standard for most disciplines. Almost every audience that I have addressed understands the suffering that comes with data variability. As such, I feel that most researchers would agree with Stebbing and Sanders when they assert that an allegation of image duplication should be addressed by assessing the likelihood that two experiments could produce such similar results.
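As a rough illustration of the likelihood argument above, here is a minimal sketch, assuming each image is available as a small grid of grayscale pixel values. The function name pixel_correlation and the example values are hypothetical and are not taken from the Oncogene editorial or from any particular image-forensics tool. The idea is only that two images from truly independent experiments would rarely be almost perfectly correlated pixel for pixel, so a trivial difference between them does not make them independent.

```python
from math import sqrt


def pixel_correlation(image_a, image_b):
    """Pearson correlation between two equal-sized grayscale images.

    Each image is a list of rows, and each row is a list of pixel values.
    A result near 1.0 means the images are nearly identical pixel for pixel.
    """
    a = [p for row in image_a for p in row]
    b = [p for row in image_b for p in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and the same size")

    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a flat image")
    return cov / sqrt(var_a * var_b)


# Hypothetical pixel values, for illustration only.
original = [[10, 200, 30], [40, 180, 20]]
near_copy = [[10, 200, 30], [40, 180, 21]]   # one pixel changed by 1
unrelated = [[200, 10, 180], [20, 30, 40]]   # a different pattern

print(pixel_correlation(original, near_copy))
print(pixel_correlation(original, unrelated))
```

In this toy case the near copy stays almost perfectly correlated with the original despite its one "trivial difference," while the unrelated pattern does not. Real image comparisons involve alignment, contrast adjustment and noise, so this is a sketch of the reasoning rather than a detection method.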

4) "It was the fault of a junior researcher."
5) "The responsible researcher is from another country and unfamiliar with the standards expected in scientific publications."

Excuses #4 and #5 involve blaming a less experienced, less knowledgeable collaborator. There is a huge problem with utilizing this type of excuse. This excuse suggests that the senior scientist on the project does not realize or is not willing to accept the enormous responsibility that comes with authorship and collaboration. When I speak to students about authorship, I remind them that authorship is not merely a reward for good work. Authorship is a tremendous responsibility. When you agree to be an author, you are agreeing to take responsibility for the entire contents of a manuscript. As such, it is critical for all authors to do their due diligence to ensure that the contents of a manuscript are true and accurate representations of the data. This includes reviewing the work and data contributed by collaborators.

A couple of years ago, there was a great discussion in "Retraction Watch," contributed by CK Gunsalus, about the responsibilities of authorship and collaboration. The discussion was a strong reminder that reviewing the work of our collaborators should be standard practice in all collaborations and, as such, should be perceived not as invasive or offensive but as expected. If all collaborators did their due diligence to ensure that the data and text of a manuscript were collected, analyzed and written with integrity, there would be no perceived offense associated with requesting access to a collaborator's raw data.

It is the responsibility of senior scientists to train and supervise junior colleagues. As such, blaming a junior collaborator for research misconduct detected in a publication, on which you are both authors, is merely a confession that you do not take the responsibilities of authorship seriously.

6) "It was only a control experiment."

If you read my response to #3, you are already aware of how important reliable controls are to my discipline. We spend weeks and sometimes months demonstrating that an effect is stable and reliable before we move on to the fun and innovative experiments. As such, it makes me shudder a bit to read that a published researcher would be so dismissive of control data. Basically, if you do not have reliable control data, then any experimental changes to your data are meaningless. Students in our discipline spend weeks or months establishing reliable control data, and their control data is a source of pride that they want to share with the world as if it were a brand new discovery. This is especially true if they have successfully replicated results that are predicted by the published literature. I have seen many graduate students proudly display and describe a control dose-response curve for morphine, one of our discipline standards in drug abuse research.

Again, I don't think that this kind of high expectation for experimental control is unique to my discipline. Rather, all disciplines have standards for experimental control. It is stunning that any researcher would dismiss control data as unimportant. As Stebbing and Sanders assert, if control data is so meaningless, then why would researchers conduct the control experiments at all, and why would they include control data in the publication? Furthermore, if readers cannot trust that authors are honestly reporting their control data, then how can the readers trust that they are honestly reporting their experimental data?

7) "The results have been replicated by ourselves or others, so the image manipulation is irrelevant."

Again, if you read my response to #3, you will know how important replicating work is to my discipline. I also do not feel that producing reliable data is merely a concern of my discipline, especially since there is an assertion that science is facing a reproducibility crisis.

Replication is, in fact, so important to my discipline that I can easily visualize a standard graph of morphine and how the shape of that curve will differ from drugs of other potencies and classes. I know how the data should look because I have looked at graphical representations of replicated experiments again and again in the published literature. Yet each time a colleague replicates the work, they are again expected to publish their replicated results.

Scholars assert that the literature is self-correcting. Basically, we have learned that with enough replication, the truth will arise from data variability. However, if our colleagues are not providing the research community with their actual replicated data, and honestly reporting whether or not their data was representative of previously published data, then the literature cannot self-correct. As a result, we will not be able to trust the literature. Furthermore, as stated above, if your readers cannot trust that you honestly provide your replicated results in publication, how can they trust your original experimental data?

8) "Someone is out to get me."

Once again, I agree with Stebbing and Sanders when they state, "Perhaps true, but irrelevant." When I speak with students about responsible research, I inform them that because of the competitive and sometimes adversarial nature of academics, researchers and administrators often make enemies. Sometimes those enemies choose to follow your career very closely so that they can utilize any perceived misstep as ammunition to damage your reputation and career. Thus, it is critical that students check and double check their work before submitting it for publication. Utilizing simple tools, such as plagiarism detection software or pixel analyses, to check the work of all contributing authors for accidental plagiarism or image manipulation, respectively, should be standard practice before submitting a manuscript for publication. Utilizing these simple tools can save your career. It is, in fact, the responsibility of all authors to ensure the integrity of the data and the text before publishing. An inherent part of building a productive research career involves disagreement with colleagues. These disagreements can trigger the creation of academic competitors and/or enemies. Luckily, protecting the integrity of your work and publications is entirely under your control. But more importantly, it is entirely your responsibility.

If we cannot trust that the literature will self-correct and we cannot trust that our colleagues honestly report their data, we cannot build on the work of our colleagues. If we cannot build on the work of our colleagues, scientific progress will stall. It is essential that all researchers honestly publish their work. Data is data and there is no lesser data. Control data, replicated experiments and original research are equally important for revealing scientific truth. Your audience needs to feel confident that all published contributions are conducted and shared with integrity so that they can trust that their future work will be built on a strong foundation.



Marianne Evola is the director of the Human Research Protection Program in the Office of Research & Innovation. She is a monthly contributor to Scholarly Messenger.
