References
- AutonomIQ (2019), “ChroPath for Firefox 5.0.9,” available at https://addons.mozilla.org/en-US/firefox/addon/chropath-for-firefox.
- Bache, S. M., and Wickham, H. (2014), “magrittr: A Forward-Pipe Operator for R,” R Package Version 1.5, available at https://CRAN.R-project.org/package=magrittr.
- Baumer, B. S., Garcia, R. L., Kim, A. K., Kinnaird, K. M., and Ott, M. Q. (2020), “Integrating Data Science Ethics Into an Undergraduate Major,” arXiv no. 2001.07649v1.
- Brantley, J. (2016), “ZillowR: R Interface to Zillow Real Estate and Mortgage Data API,” R Package Version 0.1.0, available at https://CRAN.R-project.org/package=ZillowR.
- Cantino, A., and Maxwell, K. (2013), “SelectorGadget: Point and Click CSS Selectors,” available at https://selectorgadget.com.
- Cobb, G. (1992), “Teaching Statistics” (Chapter 1), in Heeding the Call for Change Suggestions for Curricular Action, ed. L. A. Steen, Oxford: The Mathematical Association of America, pp. 3–43.
- Destatis (2018), “Methods—Approaches—Developments.”
- Distil Networks (2016), “Economics of Web Scraping Report.”
- Dumbacher, B., and Capps, C. (2016), “Big Data Methods for Scraping Government Tax Revenue From the Web,” in Proceedings of the Joint Statistical Meetings, Section on Statistical Learning and Data Science, pp. 2940–2954.
- GAISE (2005), “Guidelines for Assessment and Instruction in Statistics Education (GAISE): College Report,” available at http://www.amstat.org/education/gaise.
- GAISE (2016), “Guidelines for Assessment and Instruction in Statistics Education (GAISE): College Report,” available at http://www.amstat.org/education/gaise.
- Google (2019), “Googlebot,” available at https://support.google.com/webmasters/answer/182072?hl=en.
- Grimshaw, S. D. (2015), “A Framework for Infusing Authentic Data Experiences Within Statistics Courses,” The American Statistician, 69, 307–314. DOI: 10.1080/00031305.2015.1081106.
- Hardin, J., Hoerl, R., Horton, N. J., Nolan, D., Baumer, B., Hall-Holt, O., Murrell, P., Peng, R., Roback, P., Temple Lang, D., and Ward, M. D. (2015), “Data Science in Statistics Curricula: Preparing Students to ‘Think With Data’,” The American Statistician, 69, 343–353. DOI: 10.1080/00031305.2015.1077729.
- Henry, L., and Wickham, H. (2020), “purrr: Functional Programming Tools,” R Package Version 0.3.4, available at https://CRAN.R-project.org/package=purrr.
- Hicks, S. C., and Irizarry, R. A. (2018), “A Guide to Teaching Data Science,” The American Statistician, 72, 382–391. DOI: 10.1080/00031305.2017.1356747.
- Horton, N. J., Baumer, B. S., and Wickham, H. (2015), “Setting the Stage for Data Science: Integration of Data Management Skills in Introductory and Second Courses in Statistics,” Chance, available at https://chance.amstat.org/2015/04/setting-the-stage.
- IMDB (2019), “Feature Film, Released Between 2018-01-01 and 2018-12-31 (Sorted by Number of Votes Descending),” available at https://www.imdb.com/search/title/?title_type=feature&year=2018-01-01,2018-12-31&sort=num_votes,desc.
- Introduction to robots.txt (2019), https://support.google.com/webmasters/answer/6062608?hl=en.
- Kearney, M. W. (2019), “rtweet: Collecting and Analyzing Twitter Data,” Journal of Open Source Software, 4, 1829. R Package Version 0.7.0, available at https://joss.theoj.org/papers/10.21105/joss.01829.
- Loy, A., Kuiper, S., and Chihara, L. (2019), “Supporting Data Science in the Statistics Curriculum,” Journal of Statistics Education, 27, 2–11. DOI: 10.1080/10691898.2018.1564638.
- Meissner, P., and Run, K. (2018), “robotstxt: A robots.txt’ Parser and ‘Webbot’/‘Spider’/‘Crawler’ Permissions Checker,” R Package Version 0.6.2, available at https://CRAN.R-project.org/package=robotstxt
- Neumann, D. L., Hood, M., and Neumann, M. M. (2013), “Using Real-Life Data When Teaching Statistics: Student Perceptions of This Strategy in an Introductory Statistics Course,” Statistics Education Research Journal, 12, 59–70.
- Nolan, D., and Temple Lang, D. (2010), “Computing in the Statistics Curricula,” The American Statistician, 64, 97–107. DOI: 10.1198/tast.2010.09132.
- Nolan, D., and Temple Lang, D. (2014), XML and Web Technologies for Data Sciences With R, New York: Springer.
- Open Secrets—Foreign Connected PACs (2019), https://www.opensecrets.org/political-action-committees-pacs/foreign-connected-pacs.
- OpenSecrets.org (2019), https://www.opensecrets.org.
- Parry, J. (2019), “genius: Easily Access Song Lyrics From Genius.com,” R Package Version 2.2.0, available at https://CRAN.R-project.org/package=genius.
- Poggi, N., Berral, J. L., Moreno, T., Gavalda, R., and Torres, J. (2007), “Automatic Detection and Banning of Content Stealing Bots for e-Commerce,” in NIPS 2007 Workshop on Machine Learning in Adversarial Environments for Computer Security (Vol. 2).
- Polidoro, F., Giannini, R., Conte, R. L., Mosca, S., and Rossetti, F. (2015), “Web Scraping Techniques to Collect Data on Consumer Electronics and Airfares for Italian HICP Compilation,” Statistical Journal of the IAOS, 31, 165–176. DOI: 10.3233/sji-150901.
- Richardson, L. (2007), “Beautiful Soup Documentation.”
- Robertson, A. (2019), “Scraping Public Data From a Website Probably Isn’t Hacking, Says Court,” available at https://www.theverge.com/2019/9/10/20859399/linkedin-hiq-data-scraping-cfaa-lawsuit-ninth-circuit-ruling.
- Statistics Canada (2019), “Web Scraping,” available at https://www.statcan.gc.ca/eng/our-data/where/web-scraping.
- Stiving, M. (2017), “B2b Pricing Systems: Proving ROI,” in Innovation in Pricing, eds. A. Hinterhuber and S. M. Liouzu, London: Routledge, pp. 137–144.
- Ten Bosch, O., Windmeijer, D., van Delden, A., and van den Heuvel, G. (2018), “Web Scraping Meets Survey Design: Combining Forces,” in Big Data Meets Survey Science Conference, Barcelona, Spain.
- Wickham, H. (2014), “Tidy Data,” Journal of Statistical Software, 59, 1–23. DOI: 10.18637/jss.v059.i10.
- Wickham, H. (2019a), “rvest: Easily Harvest (Scrape) Web Pages,” R Package Version 0.3.5, available at https://CRAN.R-project.org/package=rvest.
- Wickham, H. (2019b), “stringr: Simple, Consistent Wrappers for Common String Operations,” R Package Version 1.4.0, available at https://CRAN.R-project.org/package=stringr.
- Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., Takahashi, K., Vaughan, D., Wilke, C., Woo, K., and Yutani, H. (2019), “Welcome to the Tidyverse,” Journal of Open Source Software, 4, 1686. DOI: 10.21105/joss.01686.
- Woollacott, E. (2016), “70,000 OkCupid Profiles Leaked, Intimate Details and All,” available at https://www.forbes.com/sites/emmawoollacott/2016/05/13/intimate-data-of-70000-okcupid-users-released/47645bf1e15d.
- Zamora, A. (2019), “Making Room for Big Data: Web Scraping and an Affirmative Right to Access Publicly Available Information Online,” Journal of Business, Entrepreneurship and the Law, 12, 203–228.