Article

Packaging Data Analytical Work Reproducibly Using R (and Friends)

Pages 80-88 | Received 01 May 2017, Published online: 24 Apr 2018

References

  • Allaire, J. et al. (2016), “rmarkdown: Dynamic Documents for R,” R package version 0.9.6, available at https://CRAN.R-project.org/package=rmarkdown
  • Allaire, J., R Foundation, Wickham, H., et al. (2017), “rticles,” R package, version 0.4.1, available at https://CRAN.R-project.org/package=rticles
  • Allison, S. (2016), “Other People’s Data: Humanities Edition,” Journal of Cultural Analytics. Available at http://culturalanalytics.org/2016/12/other-peoples-data-humanities-edition/
  • Aust, F. (2016), “citr: ‘RStudio’ Add-in to Insert Markdown Citations,” R package, version 0.2.0, available at https://CRAN.R-project.org/package=citr
  • Ball, R., and Medeiros, N. (2012), “Teaching Integrity in Empirical Research: A Protocol for Documenting Data Management and Analysis,” The Journal of Economic Education, 43, 182–189.
  • Baumer, B., and Udwin, D. (2015), “R Markdown,” Wiley Interdisciplinary Reviews: Computational Statistics, 7, 167–177.
  • Begley, C. G., and Ellis, L. M. (2012), “Drug Development: Raise Standards for Preclinical Cancer Research,” Nature, 483, 531–533.
  • Blischak, J., Carbonetto, P., and Stephens, M. (2017), “workflowr: A Workflow Template for Creating a Research Website,” R package, version 0.11.0, available at https://github.com/jdblischak/workflowr
  • Boettiger, C. (2015), “An Introduction to Docker for Reproducible Research,” ACM SIGOPS Operating Systems Review, 49, 71–79.
  • Boettiger, C., and Eddelbuettel, D. (2017), “An Introduction to Rocker: Docker Containers for R,” The R Journal, 9, 527–536.
  • Boettiger, C., Mangel, M., and Munch, S. (2015), “Avoiding Tipping Points in Fisheries Management Through Gaussian Process Dynamic Programming,” Proceedings of the Royal Society of London, Series B, 282, 20141631.
  • Bollen, K., Cacioppo, J., Kaplan, R., Krosnick, J., and Olds, J. (2015), “Social, Behavioral, and Economic Sciences Perspectives on Robust and Reliable Science: Report of the Subcommittee on Replicability in Science, Advisory Committee to the National Science Foundation Directorate for Social, Behavioral, and Economic Sciences,” Retrieved from the National Science Foundation, available at www.nsf.gov/sbe/AC_Materials/SBE_Robust_and_Reliable_Research_Report.pdf.
  • Broman, K. (2013), “minimal Make: A Minimal Tutorial on Make,” available at http://kbroman.org/minimal_make.
  • ——— (2016), “Organize Your Data and Code,” available at http://kbroman.org/steps2rr/pages/organize.html
  • Bryan, J. (2016), Stat545: Data Wrangling, Exploration, and Analysis With R, available at http://stat545.com/.
  • Buckheit, J. B., and Donoho, D. L. (1995), “Wavelab and Reproducible Research,” Wavelets and Statistics. Lecture Notes in Statistics, 103, 55–81.
  • Claerbout, J. F., and Karrenbach, M. (1992), “Electronic Documents Give Reproducible Research a New Meaning,” in Society of Exploration Geophysics Technical Program Expanded Abstracts, pp. 601–604, available at http://sepwww.stanford.edu/doku.php?id=sep:research:reproducible:seg92
  • Clarkson, C., Smith, M., Marwick, B., Fullagar, R., Wallis, L. A., Faulkner, P., Manne, T., Hayes, E., Roberts, R. G., and Jacobs, Z. (2015), “The Archaeology, Chronology and Stratigraphy of Madjedbebe (Malakunanja II): A Site in Northern Australia With Early Occupation,” Journal of Human Evolution, 83, 46–64.
  • Collberg, C., and Proebsting, T. A. (2016), “Repeatability in Computer Systems Research,” Communications of the ACM, 59, 62–69.
  • Conrad, C., Higham, C., Eda, M., and Marwick, B. (2016), “Palaeoecology and Forager Subsistence Strategies During the Pleistocene-Holocene Transition: A Reinvestigation of the Zooarchaeological Assemblage From Spirit Cave, Mae Hong Son Province, Thailand,” Asian Perspectives, 55, 2–27.
  • Curriculum, R. S. (2016), “The Organization Lesson for the Reproducible Science Curriculum,” available at https://github.com/Reproducible-Science-Curriculum/rr-organization1
  • Donoho, D. L. (2010), “An Invitation to Reproducible Computational Research,” Biostatistics, 11, 385–388. Available at http://biostatistics.oxfordjournals.org/content/11/3/385.short
  • Dorch, S. (2012), “On the Citation Advantage of Linking to Data: Astrophysics,” available at https://halshs.archives-ouvertes.fr/hprints-00714715/
  • Duffy, M. A., James, T. Y., and Longworth, A. (2015), “Ecology, Virulence, and Phylogeny of Blastulidium Paedophthorum, a Widespread Brood Parasite of Daphnia,” Applied and Environmental Microbiology. Available at http://aem.asm.org/content/early/2015/06/02/AEM.01369-15.abstract
  • Eglen, S. J. (2016), “Bivariate Spatial Point Patterns in the Retina: A Reproducible Review,” Journal de la Société Française de Statistique, 157, 33–48.
  • FitzJohn, R. (2016), Remake: Make-Like Build Management, R package version 0.2.0. Available at https://github.com/richfitz/remake
  • Gandrud, C. (2013), Reproducible Research With R and RStudio, Boca Raton, FL: CRC Press.
  • Gentleman, R., and Temple Lang, D. (2004), “Statistical Analyses and Reproducible Research,” Bioconductor Project Working Papers, p. 2.
  • ——— (2012), “Statistical Analyses and Reproducible Research,” Journal of Computational and Graphical Statistics, 16, 1–23.
  • Gentleman, R. (2005), “Reproducible Research: A Bioinformatics Case Study,” Statistical Applications in Genetics and Molecular Biology, 4, 1034.
  • Gleditsch, N. P., and Strand, H. (2003), “Posting Your Data: Will You be Scooped or Will You be Famous?” International Studies Perspectives, 4, 72–107.
  • Goldstone, A. (2017), “From Reproducible to Productive,” Journal of Cultural Analytics. Available at http://culturalanalytics.org/2017/02/from-reproducible-to-productive/
  • Goodman, S. N., Fanelli, D., and Ioannidis, J. P. A. (2016), “What Does Research Reproducibility Mean?” Science Translational Medicine, 8, 341ps12.
  • Graham, S., Milligan, I., and Weingart, S. (2015), Exploring Big Historical Data: The Historian’s Macroscope, Hackensack, NJ: Imperial College Press.
  • Henneken, E. A., and Accomazzi, A. (2011), “Linking to Data: Effect on Citation Rates in Astronomy,” CoRR, abs/1111.3618. Available at http://arxiv.org/abs/1111.3618
  • Hollister, J. W., Milstead, W. B., and Kreakie, B. J. (2016), “Modeling Lake Trophic State: A Random Forest Approach,” Ecosphere, 7, e01321.
  • Johnston, L. (2016), “prodigenr: Research Project Directory Generator,” R package, version 0.3.0. Available at https://cran.r-project.org/package=prodigenr
  • Jones, Z. M. (2013), “Git/GitHub, Transparency, and Legitimacy in Quantitative Research,” The Political Methodologist, 21, 6–7. Available at http://zmjones.com/static/papers/git.pdf
  • Kamvar, Z. N., Amaradasa, B. S., Jhala, R., McCoy, S., Steadman, J. R., and Everhart, S. E. (2017), “Population Structure and Phenotypic Variation of Sclerotinia sclerotiorum from Dry Bean (Phaseolus vulgaris) in the United States,” PeerJ, 5, e4152.
  • Kidwell, M. C., Lazarević, L. B., Baranski, E., Hardwicke, T. E., Piechowski, S., Falkenberg, L.-S., Kennett, C., Slowik, A., Sonnleitner, C., Hess-Holden, C., Errington, T. M., Fiedler, S., and Nosek, B. A. (2016), “Badges to Acknowledge Open Practices: A Simple, Low-Cost, Effective Method for Increasing Transparency,” PLoS Biol, 14, 1–15.
  • Klein, M., Van de Sompel, H., Sanderson, R., Shankar, H., Balakireva, L., Zhou, K., and Tobin, R. (2014), “Scholarly Context Not Found: One in Five Articles Suffers From Reference Rot,” PLOS ONE, 9, 1–39.
  • Knuth, D. E. (1992), “Literate Programming,” CSLI Lecture Notes, Stanford, CA: Center for the Study of Language and Information (CSLI), 1992, 1.
  • Koenker, R. (1996), “Reproducible Econometric Research,” Technical Report, Department of Econometrics, University of Illinois, Urbana-Champaign, IL. Available at http://www.econ.uiuc.edu/~roger/research/repro/
  • Landau, W. M. (2017), “drake: Data Frames in R for Make,” R package, version 5.0.0, available at https://cran.r-project.org/package=drake
  • Leisch, F. (2002), “Sweave: Dynamic Generation of Statistical Reports Using Literate Data Analysis,” in Compstat, New York: Springer, pp. 575–580.
  • LeVeque, R. J., Mitchell, I. M., and Stodden, V. (2012), “Reproducible Research for Scientific Computing: Tools and Strategies for Changing the Culture,” Computing in Science & Engineering, 14, 13–17. Available at http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6171147
  • Loeliger, J., and McCullough, M. (2012), Version Control with Git: Powerful Tools and Techniques for Collaborative Software Development, Sebastopol, CA: O’Reilly Media, Inc.
  • Markowetz, F. (2015), “Five Selfish Reasons to Work Reproducibly,” Genome Biology, 16, 274.
  • Marwick, B. (2017), “Computational Reproducibility in Archaeological Research: Basic Principles and a Case Study of Their Implementation,” Journal of Archaeological Method and Theory, 24, 424.
  • Marwick, B., Clarkson, C., O’Connor, S., and Collins, S. (2016), “Early Modern Human Lithic Technology From Jerimalai, East Timor,” Journal of Human Evolution, 101, 45–64.
  • Marwick, B., Hayes, E., Clarkson, C., and Fullagar, R. (2017a), “Movement of Lithics by Trampling: An Experiment in the Madjedbebe Sediments, Northern Australia,” Journal of Archaeological Science, 79, 73–85.
  • Marwick, B., Van Vlack, H., Conrad, C., Shoocongdej, R., Thongcharoenchaikit, C., and Kwak, S. (2017b), “Adaptations to Sea Level Change and Transitions to Agriculture at Khao Toh Chong Rockshelter, Peninsular Thailand,” Journal of Archaeological Science, 77, 94–108.
  • Marwick, B., Knitter, D., Kennedy, P., Muller-Scheessel, N., Hinz, M., Schmid, C., Braun, R., and Francuzik, W. (2017c), “rrtools: Tools for Writing Reproducible Research in R,” R package version 0.1.0. Available at https://github.com/benmarwick/rrtools
  • Marwick, B., JooYoung, S., and Bengtsson, H. (2017d), “wordcountaddin: Word Counts and Readability Statistics in R Markdown Documents,” R package, version 0.2.0, available at https://github.com/benmarwick/wordcountaddin
  • McKiernan, E. C., Bourne, P. E., Brown, C. T., Buck, S., Kenall, A., Lin, J., McDougall, D., Nosek, B. A., Ram, K., Soderberg, C. K., Spies, J. R., Thaney, K., Updegrove, A., Woo, K. H., and Yarkoni, T. (2016), “How Open Science Helps Researchers Succeed,” eLife, 5, e16800. Available at https://dx.doi.org/10.7554/eLife.16800
  • Microsoft Corporation (2016), “checkpoint: Install Packages From Snapshots on the Checkpoint Server for Reproducibility,” R package version 0.3.16. Available at https://CRAN.R-project.org/package=checkpoint
  • Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Percie du Sert, N., Simonsohn, U., Wagenmakers, E.-J., Ware, J. J., and Ioannidis, J. P. A. (2017), “A Manifesto for Reproducible Science,” Nature Human Behaviour, 1, 0021.
  • Negre, J., Munoz, F., and Lancelotti, C. (2016), “Geostatistical Modelling of Chemical Residues on Archaeological Floors in the Presence of Barriers,” Journal of Archaeological Science, 70, 91–101.
  • Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., Buck, S., Chambers, C. D., Chin, G., Christensen, G., Contestabile, M., Dafoe, A., Eich, E., Freese, J., Glennerster, R., Goroff, D., Green, D. P., Hesse, B., Humphreys, M., Ishiyama, J., Karlan, D., Kraut, A., Lupia, A., Mabry, P., Madon, T., Malhotra, N., Mayo-Wilson, E., McNutt, M., Miguel, E., Levy Paluck, E., Simonsohn, U., Soderberg, C., Spellman, B. A., Turitto, J., VandenBos, G., Vazire, S., Wagenmakers, E. J., Wilson, R., and Yarkoni, T. (2015), “Promoting an Open Research Culture: Author Guidelines for Journals Could Help to Promote Transparency, Openness, and Reproducibility,” Science, 348, 1422–1445.
  • Open Science Collaboration (2015), “Estimating the Reproducibility of Psychological Science,” Science, 349, aac4716.
  • Peng, R. D. (2009), “Reproducible Research and Biostatistics,” Biostatistics, 10, 405–408.
  • Peng, R. D., Dominici, F., and Zeger, S. L. (2006), “Reproducible Epidemiologic Research,” American Journal of Epidemiology, 163, 783–789.
  • Pienta, A. M., Alter, G. C., and Lyle, J. A. (2010), “The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data,” available at https://deepblue.lib.umich.edu/handle/2027.42/78307
  • Piper, A. et al. (2017), “Data Sharing Policy,” Journal of Cultural Analytics. Available at http://culturalanalytics.org/about/about-ca/
  • Piwowar, H. A., Day, R. S., and Fridsma, D. B. (2007), “Sharing Detailed Research Data is Associated With Increased Citation Rate,” PLoS ONE, 2, e308. Available at http://dx.plos.org/10.1371/journal.pone.0000308
  • Piwowar, H. A., and Vision, T. J. (2013), “Data Reuse and the Open Data Citation Advantage,” PeerJ, 1, e175.
  • Ram, K. (2013), “Git can Facilitate Greater Reproducibility and Increased Transparency in Science,” Source Code for Biology and Medicine, 8, 7.
  • Rokem, A., Marwick, B., and Staneva, V. (2017), “Assessing Reproducibility,” in J. Kitzes, D. Turek, and F. Deniz, eds, The Practice of Reproducible Research: Case Studies and Lessons from the Data-Intensive Sciences, CA: University of California Press.
  • rOpenSci et al. (2017), “Reproducibility Guide,” available at http://ropensci.github.io/reproducibility-guide/
  • Sandve, G. K., Nekrutenko, A., Taylor, J., and Hovig, E. (2013), “Ten Simple Rules for Reproducible Computational Research,” PLoS Comput Biol, 9, e1003285. Available at http://dx.doi.org/10.1371/journal.pcbi.1003285
  • Sears, J. (2011), “Data Sharing Effect on Article Citation Rate in Paleoceanography,” in AGU Fall Meeting Abstracts (Vol. 1), p. 1628. Available at http://adsabs.harvard.edu/abs/2011AGUFMIN53B1628S
  • Silverman, N. (2015), “makeProject: Creates an Empty Package Framework for the LCFD Format,” R package, version 1.0, available at https://CRAN.R-project.org/package=makeProject
  • Stanisic, L., and Legrand, A. (2014), “Effective Reproducible Research With Org-Mode and Git,” in Euro-Par 2014: Parallel Processing Workshops, Springer, pp. 475–486. Available at http://link.springer.com/chapter/10.1007/978-3-319-14325-5_41.
  • Stodden, V. (2009), “The Legal Framework for Reproducible Scientific Research: Licensing and Copyright,” Computing in Science & Engineering, 11, 35–40.
  • ——— (2014), “What Scientific Idea is Ready for Retirement? Reproducibility,” Edge. Available at https://www.edge.org/response-detail/25340
  • Stodden, V., McNutt, M., Bailey, D. H., Deelman, E., Gil, Y., Hanson, B., Heroux, M., Ioannidis, J. P., and Taufer, M. (2016), “Enhancing Reproducibility for Computational Methods,” Science, 354, 1240–1241.
  • Stodden, V., and Miguez, S. (2014), “Best Practices for Computational Science: Software Infrastructure and Environments for Reproducible and Extensible Research,” Journal of Open Research Software, 2. Available at http://openresearchsoftware.metajnl.com/articles/10.5334/jors.ay/
  • Teal, T. K., Cranston, K. A., Lapp, H., White, E., Wilson, G., Ram, K., and Pawlik, A. (2015), “Data Carpentry: Workshops to Increase Data Literacy for Researchers,” International Journal of Digital Curation, 10, 135–143.
  • Tippmann, S. (2014), “Programming Tools: Adventures With R,” Nature, 517, 109–110.
  • Ushey, K., McPherson, J., Cheng, J., Atkins, A., and Allaire, J. (n.d.), “packrat: A Dependency Management System for Projects and Their R Package Dependencies,” R package version 0.4.7-12. Available at https://github.com/rstudio/packrat/
  • Vandewalle, P., Kovacevic, J., and Vetterli, M. (2009), “Reproducible Research in Signal Processing,” IEEE Signal Processing Magazine, 26, 37–47.
  • Vinod, H. (2001), “Care and Feeding of Reproducible Econometrics,” Journal of Econometrics, 100, 87–88.
  • White, J. M. (2014), “ProjectTemplate: Automates the Creation of New Statistical Analysis Projects,” R package version 0.6. Available at https://CRAN.R-project.org/package=ProjectTemplate
  • Wickham, H. (2015), R Packages (1st ed.), Sebastopol, CA: O’Reilly Media, Inc.
  • Wickham, H., and Chang, W. (2016), “devtools: Tools to Make Developing R Packages Easier,” R package version 1.12.0. Available at https://CRAN.R-project.org/package=devtools
  • Wilson, G. (2013), “Software Carpentry: Lessons Learned,” CoRR, abs/1307.5448. Available at http://arxiv.org/abs/1307.5448
  • Wilson, G., Aruliah, D. A., Brown, C. T., Chue Hong, N. P., Davis, M., Guy, R. T., Haddock, S. H. D., Huff, K. D., Mitchell, I. M., Plumbley, M. D., Waugh, B., White, E. P., and Wilson, P. (2014), “Best Practices for Scientific Computing,” PLoS Biology, 12, e1001745. Available at http://dx.plos.org/10.1371/journal.pbio.1001745
  • Xie, Y. (2015), Dynamic Documents With R and knitr (2nd ed.), Boca Raton, FL: Chapman and Hall. ISBN 978-1498716963. Available at http://yihui.name/knitr/
  • ——— (2016), bookdown: Authoring Books and Technical Documents with R Markdown, Boca Raton, FL: Chapman and Hall/CRC.
