Journal of Statistics and Data Science Education Volume 29, 2021 - Issue sup1: Computing in the Statistics and Data Science Curriculum

Open access


Expanding the Scope of Statistical Computing: Training Statisticians to Be Software Engineers

Alex Reinhart (corresponding author), Department of Statistics & Data Science, Carnegie Mellon University, Pittsburgh, PA

https://orcid.org/0000-0002-6658-514X

Christopher R. Genovese, Department of Statistics & Data Science, Carnegie Mellon University, Pittsburgh, PA

Pages S7-S15 | Published online: 22 Mar 2021

https://doi.org/10.1080/10691898.2020.1845109

References

Barry, T. (2018), “Collections in R: Review and Proposal,” The R Journal, 10, 455–471, DOI: 10.32614/RJ-2018-037.
Baumer, B. S., Çetinkaya-Rundel, M., Bray, A., Loi, L., and Horton, N. J. (2014), “R Markdown: Integrating a Reproducible Analysis Tool Into Introductory Statistics,” Technology Innovations in Statistics Education, 8, available at https://escholarship.org/uc/item/90b2f5xh.
Behnel, S., Bradshaw, R., Citro, C., Dalcin, L., Seljebotn, D. S., and Smith, K. (2011), “Cython: The Best of Both Worlds,” Computing in Science & Engineering, 13, 31–39, DOI: 10.1109/MCSE.2010.118.
Beller, M., Bacchelli, A., Zaidman, A., and Juergens, E. (2014), “Modern Code Reviews in Open-Source Projects: Which Problems Do They Fix?,” in Proceedings of the 11th Working Conference on Mining Software Repositories, pp. 202–211. DOI: 10.1145/2597073.2597082.
Bentley, J. L. (1975), “Multidimensional Binary Search Trees Used for Associative Searching,” Communications of the ACM, 18, 509–517, DOI: 10.1145/361002.361007.
Bion, R., Chang, R., and Goodman, J. (2018), “How R Helps Airbnb Make the Most of Its Data,” The American Statistician, 72, 46–52, DOI: 10.1080/00031305.2017.1392362.
Bissi, W., Neto, A. G. S. S., and Emer, M. C. F. P. (2016), “The Effects of Test Driven Development on Internal Quality, External Quality and Productivity: A Systematic Review,” Information and Software Technology, 74, 45–54, DOI: 10.1016/j.infsof.2016.02.004.
Breiman, L. (2001), “Random Forests,” Machine Learning, 45, 5–32, DOI: 10.1023/A:1010933404324.
Bryan, J. (2018), “Excuse Me, Do You Have a Moment to Talk About Version Control?,” The American Statistician, 72, 20–27, DOI: 10.1080/00031305.2017.1399928.
Çetinkaya-Rundel, M., and Rundel, C. (2018), “Infrastructure and Tools for Teaching Computing Throughout the Statistical Curriculum,” The American Statistician, 72, 58–65, DOI: 10.1080/00031305.2017.1397549.
CMU Career & Professional Development Center (2018), “First Destination Outcomes: Dietrich College Statistics & Data Science, Bachelor’s,” available at https://www.cmu.edu/career/documents/2018_one_pagers/dc/Bachelors/%20Stats.pdf.
Eddelbuettel, D., and Francois, R. (2011), “Rcpp: Seamless R and C++ Integration,” Journal of Statistical Software, 40, 1–18, DOI: 10.18637/jss.v040.i08.
Eklund, A., Nichols, T. E., and Knutsson, H. (2016), “Cluster Failure: Why fMRI Inferences for Spatial Extent Have Inflated False-Positive Rates,” Proceedings of the National Academy of Sciences of the United States of America, 113, 7900–7905, DOI: 10.1073/pnas.1602413113.
Fiksel, J., Jager, L. R., Hardin, J. S., and Taub, M. A. (2019), “Using GitHub Classroom to Teach Statistics,” Journal of Statistics Education, 27, 110–119, DOI: 10.1080/10691898.2019.1617089.
Freeman, S., Eddy, S. L., McDonough, M., Smith, M. K., Okoroafor, N., Jordt, H., and Wenderoth, M. P. (2014), “Active Learning Increases Student Performance in Science, Engineering, and Mathematics,” Proceedings of the National Academy of Sciences of the United States of America, 111, 8410–8415, DOI: 10.1073/pnas.1319030111.
Gray, A. G., and Moore, A. W. (2003), “Nonparametric Density Estimation: Toward Computational Tractability,” in Proceedings of the 2003 SIAM International Conference on Data Mining, pp. 203–211. DOI: 10.1137/1.9781611972733.19.
Greenhouse, J. B., and Seltman, H. J. (2018), “On Teaching Statistical Practice: From Novice to Expert,” The American Statistician, 72, 147–154, DOI: 10.1080/00031305.2016.1270230.
Johnson, N. A. (2013), “A Dynamic Programming Algorithm for the Fused Lasso and L0-Segmentation,” Journal of Computational and Graphical Statistics, 22, 246–260, DOI: 10.1080/10618600.2012.681238.
Jordan, M. I. (2013), “On Statistics, Computation and Scalability,” Bernoulli, 19, 1378–1390, DOI: 10.3150/12-BEJSP17.
Kulik, C.-L. C., Kulik, J. A., and Bangert-Drowns, R. L. (1990), “Effectiveness of Mastery Learning Programs: A Meta-Analysis,” Review of Educational Research, 60, 265–299, DOI: 10.3102/00346543060002265.
Leisch, F. (2002), “Sweave: Dynamic Generation of Statistical Reports Using Literate Data Analysis,” in Compstat 2002—Proceedings in Computational Statistics, eds. W. Härdle and B. Rönz, Heidelberg: Physica Verlag, pp. 575–580.
Liu, F. T., Ting, K. M., and Zhou, Z.-H. (2012), “Isolation-Based Anomaly Detection,” ACM Transactions on Knowledge Discovery From Data, 6, 1–39, DOI: 10.1145/2133360.2133363.
Mäntylä, M. V., and Lassenius, C. (2009), “What Types of Defects Are Really Discovered in Code Reviews?,” IEEE Transactions on Software Engineering, 35, 430–448, DOI: 10.1109/TSE.2008.71.
National Academies of Sciences, Engineering, and Medicine (2018), Data Science for Undergraduates: Opportunities and Options, Washington, DC: The National Academies Press.
Nolan, D., and Temple Lang, D. (2009), “Integrating Computing Into the Statistics Curricula,” available at https://www.stat.berkeley.edu/statcur/.
——— (2010), “Computing in the Statistics Curricula,” The American Statistician, 64, 97–107, DOI: 10.1198/tast.2010.09132.
Rigby, P. C., and Bird, C. (2013), “Convergent Contemporary Software Peer Review Practices,” in Proceedings of the 9th Joint Meeting on Foundations of Software Engineering, pp. 202–212.
Rossini, A. J. (2001), “Literate Statistical Practice,” in Proceedings of the 2nd International Workshop on Distributed Statistical Computing, eds. K. Hornik and F. Leisch.
Sadowski, C., Söderberg, E., Church, L., Sipko, M., and Bacchelli, A. (2018), “Modern Code Review: A Case Study at Google,” in Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice, pp. 181–190.
Wang, A. L.-C. (2003), “An Industrial-Strength Audio Search Algorithm,” in Proceedings of the 4th International Conference on Music Information Retrieval.
Wayne, K. (2016), “Autocomplete-Me,” in SIGCSE Nifty Assignments, available at http://nifty.stanford.edu/2016/wayne-autocomplete-me/.
Wickham, H. (2011), “testthat: Get Started With Testing,” The R Journal, 3, 5–10. DOI: 10.32614/RJ-2011-002.
——— (2014), “Tidy Data,” Journal of Statistical Software, 59, 1–23, DOI: 10.18637/jss.v059.i10.
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D. A., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., Takahashi, K., Vaughan, D., Wilke, C., Woo, K., and Yutani, H. (2019), “Welcome to the Tidyverse,” Journal of Open Source Software, 4, 1686, DOI: 10.21105/joss.01686.
Williams, L., Maximilien, E. M., and Vouk, M. (2003), “Test-Driven Development as a Defect-Reduction Practice,” in 14th International Symposium on Software Reliability Engineering, pp. 34–45.
Xie, Y. (2015), Dynamic Documents With R and knitr (2nd ed.), Boca Raton, FL: Chapman and Hall/CRC.
Xie, Y., Allaire, J. J., and Grolemund, G. (2018), R Markdown: The Definitive Guide, Boca Raton, FL: Chapman and Hall/CRC.
