Approachable Case Studies Support Learning and Reproducibility in Data Science: An Example from Evolutionary Biology

Pages 304-310 | Published online: 22 Sep 2022


  • Baker, M. (2016), “Is There a Reproducibility Crisis?,” Nature, 533, 353–366.
  • Baker, M. (2017), “Scientific Computing: Code Alert,” Nature, 541, 563–565.
  • Bakken, S. (2019), “The Journey to Transparency, Reproducibility, and Replicability,” Journal of the American Medical Informatics Association, 26, 185–187. DOI: 10.1093/jamia/ocz007.
  • Ball, P. (2017), “It’s Not Just You: Science Papers Are Getting Harder to Read,” Nature, 30. [online] Available at https://www.nature.com/news/it-s-not-just-you-science-papers-are-getting-harder-to-read-1.21751
  • Chandler, P., and Sweller, J. (1996), “Cognitive Load While Learning to Use a Computer Program,” Applied Cognitive Psychology, 10, 151–170. DOI: 10.1002/(SICI)1099-0720(199604)10:2<151::AID-ACP380>3.0.CO;2-U.
  • Curating for Reproducibility Consortium. (2017), “Defining ‘reproducibility’,” Available at cure.web.unc.edu/defining-reproducibility/.
  • Debruine, L., and Taylor, J. (2019), “PsyTeachR - University of Glasgow School of Psychology and Neuroscience,” Available at https://psyteachr.github.io/.
  • Dobzhansky, T. (1973), “Nothing in Biology Makes Sense Except in the Light of Evolution,” The American Biology Teacher, 35, 125–129. DOI: 10.2307/4444260.
  • Dogucu, M., and Cetinkaya-Rundel, M. (2022), “Tools and Recommendations for Reproducible Teaching,” Technical Report. arXiv:2202.09504 [stat] type: article. Available at http://arxiv.org/abs/2202.09504.
  • Eglen, S. J. (2009), “A Quick Guide to Teaching R Programming to Computational Biology Students,” PloS Computational Biology, 5, e1000482. DOI: 10.1371/journal.pcbi.1000482.
  • Fecher, B., and Friesike, S. (2014), Open Science: One Term, Five Schools of Thought, pp. 17–47, Cham: Springer.
  • Felder, R. M., and Brent, R. (2009), “Active Learning: An Introduction,” ASQ Higher Education Brief, 2, 1–5.
  • Freeman, S., Eddy, S. L., McDonough, M., Smith, M. K., Okoroafor, N., Jordt, H., and Wenderoth, M. P. (2014), “Active Learning Increases Student Performance in Science, Engineering, and Mathematics,” Proceedings of the National Academy of Sciences, 111, 8410–8415. DOI: 10.1073/pnas.1319030111.
  • Frese, M. (1995), “Error Management in Training: Conceptual and Empirical Results,” in Organizational Learning and Technological Change, Springer, pp. 112–124.
  • Fritzson, P., Gunnarsson, J., and Jirstrand, M. (2002), “MathModelica – An Extensible Modeling and Simulation Environment With Integrated Graphics and Literate Programming,” in 2nd International Modelica Conference, March 18-19, Munich, Germany.
  • Gabelica, M., Bojčić, R., and Puljak, L. (2022), “Many Researchers Were Not Compliant With Their Published Data Sharing Statement: Mixed-Methods Study,” Journal of Clinical Epidemiology, 150, 33–41. DOI: 10.1016/j.jclinepi.2022.05.019.
  • Gaspar, A., and Langevin, S. (2007), “Restoring “Coding With Intention” in Introductory Programming Courses,” in Proceedings of the 8th ACM SIGITE Conference on Information Technology Education, pp. 91–98.
  • Google Inc., and Gallup Inc. (2016), “Diversity Gaps in Computer Science: Exploring the Underrepresentation of Girls, Blacks and Hispanics,” Retrieved from http://goo.gl/PG34aH (Additional reports from Google’s Computer Science Education Research are available at g.co/cseduresearch).
  • Guzdial, M., and Barr, V. (2013), “The Lure of Live Coding; the Attraction of Small Data,” Communications of the ACM (Association of Computing Machinery), 56, 10–11. DOI: 10.1145/2534706.2534710.
  • Hilton III, J., Wiley, D., Stein, J., and Johnson, A. (2010), “The Four ‘R’s’ of Openness and ALMS Analysis: Frameworks for Open Educational Resources,” Open Learning: The Journal of Open, Distance and e-Learning, 25, 37–44.
  • Ioannidis, J. P. (2005), “Why Most Published Research Findings Are False,” PloS Medicine, 2, e124. DOI: 10.1371/journal.pmed.0020124.
  • Karimzadeh, M., and Hoffman, M. M. (2018), “Top Considerations for Creating Bioinformatics Software Documentation,” Briefings in Bioinformatics, 19, 693–699. DOI: 10.1093/bib/bbw134.
  • KewalRamani, A., Zhang, J., Wang, X., Rathbun, A., Corcoran, L., Diliberti, M., and Zhang, J. (2018), “Student Access to Digital Learning Resources outside of the Classroom. NCES 2017-098,” National Center for Education Statistics.
  • King, G. (1995), “Replication, Replication,” PS: Political Science & Politics, 28, 444–452.
  • Knuth, D. E. (1984), “Literate Programming,” The Computer Journal, 27, 97–111. DOI: 10.1093/comjnl/27.2.97.
  • Krawczyk, M., and Reuben, E. (2012), “(Un) Available Upon Request: Field Experiment on Researchers’ Willingness to Share Supplementary Materials,” Accountability in Research, 19, 175–186.
  • Lambert, J., Kalyuga, S., and Capan, L. A. (2009), “Student Perceptions and Cognitive Load: What Can They Tell Us About e-Learning Web 2.0 Course Design?,” e-Learning and Digital Media, 6, 150–163. DOI: 10.2304/elea.2009.6.2.150.
  • Marks, S., Buckley, A., Reinhold, M., and Goetz, B. (2017), “JEP 277: Enhanced Deprecation,” JEP 277: Enhanced Deprecation. Available at http://openjdk.java.net/jeps/277.
  • Master, A., Meltzoff, A. N., and Cheryan, S. (2021), “Gender Stereotypes About Interests Start Early and Cause Gender Disparities in Computer Science and Engineering,” Proceedings of the National Academy of Sciences, 118, e2100030118. DOI: 10.1073/pnas.2100030118.
  • McDonnell, L., Barker, M. K., and Wieman, C. (2016), “Concepts First, Jargon Second Improves Student Articulation of Understanding,” Biochemistry and Molecular Biology Education, 44, 12–19. DOI: 10.1002/bmb.20922.
  • McTavish, E. J., Hinchliff, C. E., Allman, J. F., Brown, J. W., Cranston, K. A., Holder, M. T., Rees, J. A., and Smith, S. A. (2015), “Phylesystem: A Git-Based Data Store for Community-Curated Phylogenetic Estimates,” Bioinformatics, 31, 2794–2800. DOI: 10.1093/bioinformatics/btv276.
  • McTavish, E. J., Sánchez Reyes, L. L., and Holder, M. T. (2021), “OpenTree: A Python Package for Accessing and Analyzing Data from the Open Tree of Life,” Systematic Biology, 70, 1295–1301. DOI: 10.1093/sysbio/syab033.
  • Michonneau, F., Brown, J. W., and Winter, D. J. (2016), “rotl: an R Package to Interact With the Open Tree of Life Data,” Methods in Ecology and Evolution, 7, 1476–1481. DOI: 10.1111/2041-210X.12593.
  • Miyakawa, T. (2020), “No Raw Data, No Science: Another Possible Source of the Reproducibility Crisis,” Molecular Brain, 13, 1–6. DOI: 10.1186/s13041-020-0552-2.
  • National Academies of Sciences, Engineering, and Medicine. (2018), Data Science for Undergraduates: Opportunities and Options, Washington, DC: National Academies Press.
  • National Academies of Sciences, Engineering, and Medicine. (2019), Reproducibility and Replicability in Science, Washington, DC: National Academies Press.
  • Nederbragt, A., Harris, R. M., Hill, A. P., and Wilson, G. (2020), “Ten Quick Tips for Teaching With Participatory Live Coding,” PloS Computational Biology, 16, e1008090. DOI: 10.1371/journal.pcbi.1008090.
  • NIGMS Career Curriculum Development. (2015), “Rigor & Reproducibility, National Institute of General Medical Sciences.” Available at https://www.nigms.nih.gov/training/instpredoc/Pages/admin-supplements-prev.aspx.
  • Open Tree Of Life, Redelings, B., Cranston, K. A., Allman, J., Holder, M. T., and McTavish, E. J. (2016), “Open Tree of Life APIs v3.0,” Open Tree of Life Project (online resources). Available at https://github.com/OpenTreeOfLife/germinator/wiki/Open-Tree-of-Life-Web-APIs.
  • Open Tree Of Life, Redelings, B., Sánchez Reyes, L. L., Cranston, K. A., Allman, J., Holder, M. T., and McTavish, E. J. (2019), “Open Tree of Life Synthetic Tree v12.3,” Zenodo. Available at DOI: 10.5281/zenodo.3937742.
  • Pan, S. C., Cooke, J., Little, J. L., McDaniel, M. A., Foster, E. R., Connor, L. T., and Rickard, T. C. (2019), “Online and Clicker Quizzing on Jargon Terms Enhances Definition-Focused but not Conceptually Focused Biology Exam Performance,” CBE-Life Sciences Education, 18, ar54. DOI: 10.1187/cbe.18-12-0248.
  • Peng, R. (2015), “The Reproducibility Crisis in Science: A Statistical Counterattack,” Significance, 12, 30–32. DOI: 10.1111/j.1740-9713.2015.00827.x.
  • Peng, R. D. (2011), “Reproducible Research in Computational Science,” Science, 334, 1226–1227. DOI: 10.1126/science.1213847.
  • Piccolo, S. R., and Frampton, M. B. (2016), “Tools and Techniques for Computational Reproducibility,” Gigascience, 5, S13742–S016. DOI: 10.1186/s13742-016-0135-4.
  • Piwowar, H. (2013), “Value All Research Products,” Nature, 493, 159–159. DOI: 10.1038/493159a.
  • Pop, M., and Salzberg, S. L. (2015), “Use and Mis-Use of Supplementary Material in Science Publications,” BMC Bioinformatics, 16, 1–4.
  • Powers, S. M., and Hampton, S. E. (2019), “Open Science, Reproducibility, and Transparency in Ecology,” Ecological Applications, 29, e01822.
  • Prinz, F., Schlange, T., and Asadullah, K. (2011), “Believe it or Not: How Much Can We Rely on Published Data on Potential Drug Targets?,” Nature Reviews Drug Discovery, 10, 712–712.
  • Rees, J. A., and Cranston, K. (2017), “Automated Assembly of a Reference Taxonomy for Phylogenetic Data Synthesis,” Biodiversity Data Journal, 5, e12581.
  • Roland, M.-C., Chèvre, A.-M., Chadoeuf, J., Hubert, B., and Bonnemaire, J. (2002), “Think Forward, Act Now: Training Young Researchers for Sustainability. Reshaping the Relationship Between PhD Student and Adviser,” in 5. International COPERNICUS Conference number 8, VAS Verlag für Akademische Schriften.
  • Sánchez Reyes, L., McTavish, E., and Holder, M. (2021), “Using the Open Tree of Life for your Research, with R v0.9.1,” Open Tree of Life (online resources). Available at https://mctavishlab.github.io/R_OpenTree_tutorials/.
  • Sandve, G. K., Nekrutenko, A., Taylor, J., and Hovig, E. (2013), “Ten Simple Rules for Reproducible Computational Research,” PloS Computational Biology, 9, e1003285.
  • Sayres, M. A. W., Hauser, C., Sierk, M., Robic, S., Rosenwald, A. G., Smith, T. M., Triplett, E. W., Williams, J. J., Dinsdale, E., Morgan, W. R. et al. (2018), “Bioinformatics Core Competencies for Undergraduate Life Sciences Education,” PloS One, 13, e0196878.
  • Selvaraj, A., Zhang, E., Porter, L., and Soosai Raj, A. G. (2021), “Live Coding: A Review of the Literature,” in Proceedings of the 26th ACM Conference on Innovation and Technology in Computer Science Education V. 1, pp. 164–170.
  • Shannon, A., and Summet, V. (2015), “Live Coding in Introductory Computer Science Courses,” Journal of Computing Sciences in Colleges, 31, 158–164.
  • Steele-Johnson, D., and Kalinoski, Z. T. (2014), “Error Framing Effects on Performance: Cognitive, Motivational, and Affective Pathways,” The Journal of Psychology, 148, 93–111.
  • Sweller, J. (1988), “Cognitive Load During Problem Solving: Effects on Learning,” Cognitive Science, 12, 257–285.
  • The Turing Way Community. (2021), The Turing Way: A handbook for reproducible, ethical and collaborative research (1.0.1). Zenodo. DOI: 10.5281/zenodo.6533831.
  • University of Washington Libraries. (2022), “Teaching Reproducibility,” Available at https://guides.lib.uw.edu/research/reproducibility/teaching.
  • Vadlamani, A., Kalicheti, R., and Chimalakonda, S. (2021), “Apiscanner-towards Automated Detection of Deprecated Apis in Python Libraries,” in 2021 IEEE/ACM 43rd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), IEEE, pp. 5–8.
  • van der Meijden, A., Vredenburg, V. T., Sopory, A., Petirs, B., Tiwari, R., and Wake, D. B. (2002), “AmphibiaWeb: An Information System for Amphibian Conservation Biology. AnfibiosWeb: Un Sistema de Información Para la Biología de Conservación de Anfibios,” in Annual Meeting of the Society for Integrative and Comparative Biology, Anaheim, CA, US, January 02-06, 2002.
  • Van Merriënboer, J. J., and Ayres, P. (2005), “Research on Cognitive Load Theory and its Design Implications for e-Learning,” Educational Technology Research and Development, 53, 5–13.
  • Warner, J. R., Childs, J., Fletcher, C. L., Martin, N. D., and Kennedy, M. (2021), “Quantifying Disparities in Computing Education: Access, Participation, and Intersectionality,” in Proceedings of the 52nd ACM Technical Symposium on Computer Science Education, pp. 619–625.
  • Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J. J., Appleton, G. I., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., Bourne, P., Bouwman, J., Brookes, A., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C., Finkers, R., Gonzalez-Beltran, A., Gray, AJ., Groth, P., Goble, C., Grethe, J., Heringa, J., ’t Hoen, P.A., Hooft, R., Kuhn, T., Kok, R., Kok, J., Lusher, S., Martone, M., Mons, A., Packer, A., Persson, B., Rocca-Serra, P., Roos, M., van Schaik, R., Sansone, S-A., Schultes, E., Sengstag, T., Slater, T., Strawn, G., Swertz, M., Thompson, M., and van der Lei, J. (2016), “The FAIR Guiding Principles for Scientific Data Management and Stewardship,” Scientific Data, 3, 1–9.
  • Williams, J. J., Drew, J. C., Galindo-Gonzalez, S., Robic, S., Dinsdale, E., Morgan, W. R., Triplett, E. W., Burnette III, J. M., Donovan, S. S., Fowlks, E. R., Goodman, A. L., Grandgenett, N. F., Goller, C. C., Hauser, C., Jungck, J. R., Newman, J. D., Pearson, W. R., Ryder, E. F., Sierk, M., Smith, T. M., Tosado-Acevedo, R., Tapprich, W., Tobin, T. C., Toro-Martínez, A., Welch, L. R., Wilson, M. A., Ebenbach, D., McWilliams, M., Rosenwald, A. G., and Pauley, M. A. (2019), “Barriers to Integration of Bioinformatics Into Undergraduate Life Sciences Education: a National Study of US Life Sciences Faculty Uncover Significant Barriers to Integrating Bioinformatics Into Undergraduate Instruction,” PloS One, 14, e0224288.
  • Wilson, G. (2006), “Software Carpentry: Getting Scientists to Write Better Code by Making Them More Productive,” Computing in Science & Engineering, 8, 66–69.
  • Wilson, G. (2016), “Software Carpentry: Workshop Template v2016.06.” Available at https://github.com/carpentries/workshop-template.
  • Wilson, G. (2019), Teaching Tech Together: How to Make your Lessons Work and Build a Teaching Community Around Them, Boca Raton, FL: CRC Press.
  • Wilson, G. (2022), “The Carpentries,” Website. Available at http://software-carpentry.org.
  • Wilson, G., Bryan, J., Cranston, K., Kitzes, J., Nederbragt, L., and Teal, T. K. (2017), “Good Enough Practices in Scientific Computing,” PloS Computational Biology, 13, e1005510.
  • Wright, A. M., Schwartz, R. S., Oaks, J. R., Newman, C. E., and Flanagan, S. P. (2019), “The Why, When, and How of Computing in Biology Classrooms,” F1000Research, 8, 1854.