Methodological Studies

Gather-Narrow-Extract: A Framework for Studying Local Policy Variation Using Web-Scraping and Natural Language Processing

Pages 685-706 | Received 01 Aug 2018, Accepted 02 Aug 2019, Published online: 06 Dec 2019

