Data Science

A First Course in Data Science
