Taking a Chance in the Classroom

Spam Four Ways: Making Sense of Text Data

Pages 32-40 | Published online: 04 May 2022
 

Abstract

The world is full of text data, yet text analytics has not traditionally played a large part in statistics education. We consider four different ways to provide students with opportunities to explore whether email messages are unwanted correspondence (spam). Text from subject lines is used to identify features that can be used in classification. The approaches include use of a Model Eliciting Activity, exploration with CODAP, modeling with a specially designed Shiny app, and coding more sophisticated analyses using R. The approaches vary in their use of technology and code, but all share the common goals of using data to make better decisions and assessing the accuracy of those decisions.
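As a rough illustration of the idea described in the abstract (deriving features from subject lines, classifying messages, and then assessing accuracy), the following minimal sketch may help. It is written in Python rather than the R used in the article, and it is not the authors' method: the subject lines, the labels, the three features, the scoring rule, and every function name are assumptions invented for this example.

```python
# Toy illustration only: the subject lines and labels below are invented
# for this sketch and are NOT taken from the article's data.
LABELED_SUBJECTS = [
    ("WIN a FREE cruise now!!!", True),
    ("Meeting agenda for Tuesday", False),
    ("Free money, act now!", True),
    ("Draft of chapter 3 attached", False),
    ("Lunch on Friday?", False),
    ("URGENT: claim your prize", True),
    ("Free tickets for the department picnic!", False),
]


def extract_features(subject):
    """Turn one subject line into a few hypothetical yes/no features."""
    # Strip common punctuation so that "URGENT:" is treated as "URGENT".
    words = [w.strip(".,:;!?") for w in subject.split()]
    return {
        "has_exclamation": "!" in subject,
        "mentions_free": "free" in subject.lower(),
        "has_all_caps_word": any(
            len(w) > 1 and w.isalpha() and w.isupper() for w in words
        ),
    }


def predict_spam(subject, threshold=1):
    """Flag a message as spam when at least `threshold` features are present."""
    score = sum(extract_features(subject).values())
    return score >= threshold


def confusion_counts(labeled, threshold=1):
    """Tally true/false positives and negatives for the simple rule."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for subject, is_spam in labeled:
        predicted = predict_spam(subject, threshold)
        if predicted and is_spam:
            counts["tp"] += 1
        elif predicted and not is_spam:
            counts["fp"] += 1
        elif not predicted and not is_spam:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(counts):
    """Share of messages that the rule classified correctly."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


if __name__ == "__main__":
    counts = confusion_counts(LABELED_SUBJECTS)
    print(counts, round(accuracy(counts), 3))
```

On this toy data the rule flags one legitimate message as spam, which shows why accuracy alone can hide the kind of error being made. Changing `threshold` shifts the balance between false positives and false negatives.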

Additional information

Notes on contributors

Nicholas J. Horton

Nicholas J. Horton is Beitzel Professor of Technology and Society (statistics and data science) at Amherst College. He earned his doctorate in biostatistics from the Harvard School of Public Health in 1999 and has co-authored a series of books about data science and statistical computing. He is a member of the ASA Board of Directors and co-chair of the National Academies Committee on Applied and Theoretical Statistics. This work is part of a larger project with this team that stemmed from Horton’s work as a Tinker Fellow with the Concord Consortium.

Jie Chao

Jie Chao is a learning scientist at the Concord Consortium. She earned her PhD in instructional technology and STEM education from the University of Virginia in 2012. Chao is the principal investigator of multiple NSF-funded projects on innovative approaches to STEM teaching and learning. Her research focuses on designing learning environments that help students develop computational thinking skills, mathematical modeling competencies, and an understanding of artificial intelligence.

William Finzer

William Finzer is a senior scientist at the Concord Consortium, where he leads the development of CODAP. He serves as co-principal investigator on the NSF-funded StoryQ, M2Studio, and Boosting Data Fluency projects. Finzer’s work centers on bringing data science into the K–12 curriculum, integrated across subject areas, through the creation of data exploration software designed to be accessible and usable in the classroom.

Phebe Palmer

Phebe Palmer is a recent graduate of Amherst College, where she earned a BA in statistics in 2021. Her research centers largely on STEM education; she has assisted with projects focused on approaches to statistics pedagogy and on equitable access to STEM curricula. She works as a research assistant at SageFox Consulting Group, based in Amherst, Massachusetts.
