Accountability in Research
Ethics, Integrity and Policy
Volume 31, 2024 - Issue 6
Article Commentary

Research data mismanagement – from questionable research practice to research misconduct

Pages 706-713 | Received 24 Aug 2022, Accepted 07 Dec 2022, Published online: 14 Jan 2023
 

ABSTRACT

Good record-keeping practice and research data management underlie responsible research conduct and promote reproducibility of research findings in the sciences. In many cases of research misconduct, inadequate research data management appears as an accompanying finding. Findings of disorganized or otherwise poor data archival, or of loss of research data, are not usually considered indicative of research misconduct on their own. Focusing on the availability of raw/primary data and the replicability of research based on these, we posit that most, if not all, instances of research data mismanagement (RDMM) could be considered a questionable research practice (QRP). Furthermore, instances of RDMM at their worst could indeed be viewed as acts of research misconduct. Here, we use postulated scenarios to analyze the contexts and circumstances under which RDMM could be viewed as a significant misrepresentation of research (i.e., falsification), or as data fabrication. We further explore how RDMM might be adjudicated as research misconduct based on intent and consequences. Defining how RDMM could constitute QRP or research misconduct would aid the formulation of relevant institutional research integrity policies to mitigate undesirable events stemming from RDMM.

Acknowledgement

The authors are grateful for the constructive feedback provided by the reviewers, which helped improve the manuscript.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1. Different academic entities have different, albeit overlapping, definitions for research data. We cite here two definitions, one by an institution (University College London, UCL) and another by a journal publisher (Springer Nature). UCL: “Data are facts, observations or experiences on which an argument or theory is constructed or tested. Data may be numerical, descriptive, aural or visual. Data may be raw, abstracted or analyzed, experimental or observational. Data include but are not limited to: laboratory notebooks; field notebooks; primary research data (including research data in hardcopy or in computer readable form); questionnaires; audiotapes; videotapes; models; photographs; films; test responses … ” (https://blogs.ucl.ac.uk/rdm/2015/09/what-is-research-data). Springer Nature: Research data refers to the collection of files that support your research project, study or publication such as spreadsheets, documents, images, videos or audio (https://www.springernature.com/gp/authors/research-data).

2. Loss of data, either partial or complete, would potentially affect both reproducibility and replicability. In cases where replication is strictly reliant on the same dataset, it would definitely be affected. Also, the integrity of raw/primary data would be important for replication studies if these are scarce or opportunities for collection of these data are very limited. While gathering another human cohort for replication studies is always possible (funds and resources permitting), getting similar data for a supernova or an extremely rare geological event might be practically impossible.

3. There could of course be an apparent discordance in form between the final presented data and the primary/raw data. The latter could exist in forms that are unfamiliar or even unrecognizable to non-experts, such as digital files encoded in symbolic or computing languages, or images that can only be rendered visible or interpretable with specialized technology/software. Even a very common form, the spreadsheet, could appear to be nothing more than rows and columns of numbers with few guiding labels. However, we must be able to ascertain that the final presented data are actually derived from sets of primary/raw data that truly exist, i.e., are properly managed and archived, and that this can be done despite all the manipulations and derivations the researchers have undertaken to make the data presentable to a wider audience in publications.

4. The National Science Foundation (USA) requires that, for a finding of research misconduct, the acts be committed recklessly, knowingly, or intentionally (https://oig.nsf.gov/sites/default/files/document/2021-10/Assessing%20Intent%20in%20RM%20Investigations_4.pdf). We could also draw analogies between “intent” in research misconduct and that in criminology. In the latter, the concept of mens rea or “guilty mind” depicts states of mind that give rise to criminal liability. The American Law Institute’s Model Penal Code cites four mens rea terms, under which offenders could act “negligently,” “recklessly,” “knowingly” or “purposely,” in escalating levels of criminal intent. As pointed out earlier by Dresser (1993), the Model Penal Code’s culpability provisions could guide the assessment of intent in research misconduct.

5. Such a causal relationship would need to be established by a research misconduct investigation. For example, in cases of suspected image manipulation in publications, the individual raw image data files that contributed to the composite figures in the paper would need to be found and examined. These files would have been generated by the original image capturing software (in vendor software-specific file formats) and should bear the right timestamps. If these raw data files are missing, incomplete, or otherwise unavailable without acceptable explanation, RDMM might have played a role in the image irregularities that could amount to misconduct. Although it is standard for a research misconduct investigation process to secure data files and records, retracing or reconnecting suspicious final data/results to primary/raw data may not always be easy, and in many, if not most, cases would require the aid of domain experts.

Additional information

Funding

The authors have no conflict of interest to declare.
