Invited Comment

Partnering With Authors to Enhance Reproducibility at JASA

Pages 795-797 | Received 02 Apr 2024, Accepted 03 Apr 2024, Published online: 25 Apr 2024

1 Introduction

In 2016, JASA Applications and Case Studies (ACS) introduced a reproducibility initiative to address the lack of standardized practices for reproducibility in scientific research. This initiative established minimum criteria for the inclusion of code, data, and workflow at JASA ACS, and piloted a new editorial role, Associate Editor of Reproducibility (AER), to implement these standards. This initiative has since expanded from ACS to all original research manuscripts published at JASA, including those submitted to JASA Theory and Methods (TM). Since the inception of the JASA reproducibility initiative, the team of AERs has grown, the process has become more standardized, the role of an AER in guiding reproducibility in statistical research has been refined (Willis and Stodden 2020), and, in 2023, we implemented a reproducibility award to recognize papers published in JASA with outstanding reproducibility materials. The goals of this editorial are to (a) explain the type of reproducibility that is being addressed during the review process at JASA, (b) describe the steps of this process and its underlying philosophy, and (c) clarify how authors can streamline the review of their reproducibility materials.

The JASA initiative is part of a larger movement in the scientific community to address what is widely considered a “replicability crisis” in science (Begley and Ioannidis 2015; Open Science Collaboration 2015). However, many terms are used interchangeably, including reproducibility, replicability, reliability, robustness, and generalizability, with potentially different definitions depending on the field of research (Goodman, Fanelli, and Ioannidis 2016; Gundersen 2021). Broadly, these terms address either replicability/generalizability, which relates to whether research findings can “be shown in other datasets or populations,” or reproducibility, which is defined as the ability of a researcher to “duplicate the results of a prior study using the same materials as were used by the original investigator.”

These definitions derive from the experimental sciences; how they apply to the statistical methods-focused research featured in JASA is less clear but no less critical. Reproducibility in statistics supports the validity of the research presented in a paper and underpins the role of scientific research in advancing knowledge. This role is shared by three key stakeholders (the journal, the academic reader, and the lay public) that benefit from and contribute to high reproducibility standards. First, the reproducibility process fosters a culture of transparency and accountability that is critical to nurturing the trust placed by the lay public in scientific research. It is an essential safeguard against the dissemination of misinformation that can damage this trust and jeopardize the positive societal impact of future scientific discoveries. In addition, the academic reader will frequently use published research to motivate future research. Reproducible research creates trustworthy cumulative knowledge on which researchers can confidently build to further advance their field. Finally, the journal gains credibility from a system that holds researchers accountable for their work, credibility that can be leveraged to publish controversial and impactful research that has the potential to change a field of study. As the mediator between researchers and the lay public, the journal has substantial control of, and therefore substantial responsibility for, the dissemination of research results. JASA takes an active mediating role in upholding high reproducibility standards.

In this editorial, we seek to refine the definition of reproducibility in statistical research, which we refer to as “methods reproducibility” (Cacioppo et al. 2015). Methods reproducibility in statistics involves providing materials with two key elements: (a) they allow the numbers in the paper to be directly reproduced, and (b) they are user-friendly enough for future readers to easily use and build on the proposed technique. This second element is the most critical aspect of methods reproducibility and what we strive to uphold as AERs at JASA; we leave the assessment of paper quality and mathematical accuracy to the traditional review process. Below we describe the reproducibility effort at JASA in more detail.

2 Reproducibility Effort at JASA

To clarify our mission, we introduce the framework of “partners in” versus “officers of” reproducibility. Reproducibility “officers” would police the materials published by the author to ensure that all manuscript numbers and tables are precisely recapitulated. While this is a noble and worthwhile goal, at JASA we view our role as reproducibility “partners” who work with and guide the authors in providing resources that enable readers who are fellow statisticians to test and build on their work. To this end, and to ensure that a minimum set of standardized materials is provided by the authors of each JASA paper, we have developed a reproducibility assessment process centered around the Author Contributions Checklist (ACC) form. The ACC form is the central document of the reproducibility effort at JASA. The purpose of the ACC form, as stated in its associated GitHub repository, is to “document the artifacts associated with a manuscript (i.e., the data and code supporting the computational findings) and describe how to reproduce the findings.” Its target audience is future readers of an accepted manuscript, and it is intended to be phrased in the present tense. To help demystify the review process, we describe the workflow of a “prototypical” reproducibility reviewer:

  1. Read through the authors’ ACC form for completeness and clarity.

  2. Open any README documents provided and evaluate whether they are helpful in understanding the workflow of the attached code. READMEs are highly encouraged.

  3. Look through the directory structure of the provided materials to understand the organization of these materials.

  4. Open and read through attached code files to ensure that code is sufficiently documented for the use and understanding of future readers.

  5. Depending on the judgment of the AER, run examples or a few key pieces of code.

  6. Write a review that comments on any issues found in the steps above with the intention of helping authors strengthen materials for future readers.

Next we expand on approaches authors can take to streamline their own reproducibility materials, and we discuss common misconceptions about the AER review process.

3 Author Dos and Don’ts

We next offer suggestions on how to preemptively prepare materials that address common feedback provided in reproducibility reviews, shown in Table 1. The reproducibility review strives to be specific and to provide concrete action items that strengthen the reproducibility of the research. By following these guidelines, authors may avoid multiple rounds of reproducibility revisions.

Table 1 Dos and don’ts based on common reasons reproducibility revisions are requested.
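To make these suggestions concrete, the sketch below illustrates, in Python, several practices that frequently come up in reproducibility reviews: fixed random seeds, paths relative to the project rather than to one machine, figures written to files, and comments that tie the code back to the manuscript. The script, file, and column names are hypothetical, and the same principles apply regardless of the language used for the analysis.

```python
"""Reproduce Figure 1 (hypothetical example).

Usage:
    python reproduce_figure1.py

Expects the processed data in data/processed/example_data.csv
(with a numeric column named "value") and writes the figure to
output/figure1.png.
"""

from pathlib import Path

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Fix the random seed so that simulated quantities are identical across runs.
rng = np.random.default_rng(20240403)

# Use paths relative to this script, not absolute paths tied to one machine.
ROOT = Path(__file__).resolve().parent
DATA = ROOT / "data" / "processed" / "example_data.csv"
OUT = ROOT / "output"
OUT.mkdir(exist_ok=True)

# Load the processed data; the raw-to-processed step lives in a separate,
# documented script so that readers can trace the full workflow.
df = pd.read_csv(DATA)

# Add simulation noise with the seeded generator (illustrative only).
df["noisy"] = df["value"] + rng.normal(scale=0.1, size=len(df))

# Reproduce the figure and save it to a file rather than only displaying it.
fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(df["value"], df["noisy"], s=10)
ax.set_xlabel("Observed value")
ax.set_ylabel("Value with simulated noise")
fig.savefig(OUT / "figure1.png", dpi=300)
```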

4 FAQs about the JASA Reproducibility Process

There are many common questions and potential misconceptions about the reproducibility review process; we attempt to address them here.

Q: Do the AERs run all the code and check that all figures and tables are numerically reproducible? If you don’t run all the code, how are you even assessing reproducibility?

A: As a rule, AERs do not run all code provided by the authors. Our role is to ensure that the ACC form and code are sufficiently documented such that future readers of the paper can implement and build on results. Further, we frequently review papers with large datasets and computationally demanding methods that cannot be easily reproduced numerically but still have a rightful place in JASA; we do not want to discourage authors from submitting such manuscripts or to hold more computationally tractable methods to a higher standard. As partners in reproducibility with the authors, we trust that they have put in a good faith effort to ensure that the materials they provide will produce results consistent with their published work.

Q: Making my code reproducible will take substantial effort and time. Do I really have to provide reproducibility materials?

A: For the reasons described in Section 1, reproducibility plays a crucial role in published research. While time-consuming, preparing reproducibility materials is much easier when done throughout the research process rather than at the time of submission. The prospect of publication in a top-tier journal such as JASA, and the increased likelihood that others will cite work whose methods are easy to reproduce, should encourage authors to prepare their reproducibility materials carefully.

Q: Will my paper be rejected based on the reproducibility review?

A: No, papers at JASA will not be rejected based on the reproducibility materials. However, if the paper has been accepted but the authors do not comply with the AER’s requests, the paper will be returned to the authors for additional revisions.

Q: Do I fill out the ACC form when I initially submit my paper to JASA?

A: No, the reproducibility review process starts after a paper has already gone through one round of review and has been flagged for major or minor revisions. When revising and resubmitting their paper in this second round, authors must provide reproducibility materials.

Q: If I have a GitHub repository with my code/software package, do I still need to fill out the ACC form?

A: Yes. GitHub repositories and software packages are useful supplements to a paper. However, it is still important to fill out the ACC form to ensure that standard materials are included and to explain the workflow. In addition, while software packages can help others use the method, we ask that authors also contribute the manuscript-specific code that is typically not included in a software package; this includes data processing code, code for running simulations, and code for reproducing tables and figures. One way such code can be organized is sketched below.
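As one illustration, the following is a minimal sketch, in Python, of a top-level driver script that regenerates the manuscript's results end to end. The src modules and their run functions are hypothetical placeholders marking where the data processing, simulation, and table/figure code would plug in; they are not part of the ACC form or of any released package.

```python
"""Hypothetical top-level driver for manuscript materials.

Running this script end to end regenerates the processed data,
the simulation results, and the tables and figures in the paper.
Individual steps can also be run separately; see the README.
"""

# These imports refer to hypothetical manuscript-specific modules that sit
# alongside (not inside) the released software package.
from src import prepare_data, run_simulations, make_figures, make_tables


def main() -> None:
    # Step 1: turn the raw data into the analysis-ready dataset used in the paper.
    prepare_data.run(raw_dir="data/raw", out_dir="data/processed")

    # Step 2: run the simulation study; results are saved so that this
    # (computationally expensive) step can be skipped on later runs.
    run_simulations.run(out_dir="results/simulations", n_replicates=500)

    # Step 3: regenerate every figure and table from the saved results.
    make_figures.run(results_dir="results", out_dir="output/figures")
    make_tables.run(results_dir="results", out_dir="output/tables")


if __name__ == "__main__":
    main()
```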

Q: If I provide a link to a GitHub repository, doesn’t that unblind the process?

A: Blinding is intended to keep bias out of the acceptance and rejection decision. As AERs do not make decisions to reject or accept papers, an unblinded GitHub repository does not impact the reproducibility review process. Authors may alternatively choose to anonymize their GitHub repository.

Q: How can I strive to make my materials good enough to be considered for the reproducibility award?

A: Authors should place themselves in a reader’s shoes by imagining what they would need when attempting to reproduce a paper’s results or use the paper’s methods. Several suggestions are also made above in Section 3. Authors may further consult the submitted materials in Balocchi et al. (2023) and Gao, Bien, and Witten (2022), the winners of our 2023 JASA reproducibility award.

Q: I have a complicated, large dataset with code that is too computationally intensive to run on a standard laptop. Can I still be considered for the reproducibility award?

A: Yes. All papers accepted by JASA TM or JASA ACS in the previous calendar year will be considered for the reproducibility award.

5 Conclusion

The primary objective of the reproducibility initiative at JASA is to promote methods reproducibility in statistical research, with emphasis on the role of the AER as a partner guiding authors to provide materials that support future methods development rather than as an officer policing the code. While we present in this article our current best efforts at guiding this type of reproducibility, we acknowledge that methods reproducibility is not a static target and that our process will likely change and improve over time.

Though we focus on our efforts at JASA specifically, we believe that a wider discussion of methods reproducibility among editors across statistical journals, including the adoption of common standards, would benefit the field, and we hope that by describing our process we can help facilitate such discussions. In addition, we welcome the adoption of our ACC form as a template or starting point for other statistics and data science journals initiating their own reproducibility review process.

References

  • Balocchi, C., Deshpande, S. K., George, E. I., and Jensen, S. T. (2023), “Crime in Philadelphia: Bayesian Clustering with Particle Optimization,” Journal of the American Statistical Association, 118, 818–829. DOI: 10.1080/01621459.2022.2156348.
  • Begley, C. G., and Ioannidis, J. P. (2015), “Reproducibility in Science: Improving the Standard for Basic and Preclinical Research,” Circulation Research, 116, 116–126. DOI: 10.1161/CIRCRESAHA.114.303819.
  • Cacioppo, J. T., Kaplan, R. M., Krosnick, J. A., Olds, J. L., and Dean, H. (2015), “Social, Behavioral, and Economic Sciences Perspectives on Robust and Reliable Science,” Report of the Subcommittee on Replicability in Science Advisory Committee to the National Science Foundation Directorate for Social, Behavioral, and Economic Sciences, 1.
  • Gao, L. L., Bien, J., and Witten, D. (2022), “Selective Inference for Hierarchical Clustering,” Journal of the American Statistical Association, 119, 332–342. DOI: 10.1080/01621459.2022.2116331.
  • Goodman, S. N., Fanelli, D., and Ioannidis, J. P. (2016), “What Does Research Reproducibility Mean?,” Science Translational Medicine, 8, 341ps12. DOI: 10.1126/scitranslmed.aaf5027.
  • Gundersen, O. E. (2021), “The Fundamental Principles of Reproducibility,” Philosophical Transactions of the Royal Society A, 379, 20200210.
  • Open Science Collaboration. (2015), “Estimating the Reproducibility of Psychological Science,” Science, 349, aac4716. DOI: 10.1126/science.aac4716.
  • Willis, C., and Stodden, V. (2020), “Trust but Verify: How to Leverage Policies, Workflows, and Infrastructure to Ensure Computational Reproducibility in Publication,” Harvard Data Science Review, 2. Available at https://hdsr.mitpress.mit.edu/pub/f0obb31j. DOI: 10.1162/99608f92.25982dcf.
