Abstract
Textbooks and websites today abound with real data. One neglected issue is that statistical investigations often require a good deal of “cleaning” to ready data for analysis. The purpose of this dataset and exercise is to teach students to use exploratory tools to identify erroneous observations. This article discusses the merits of such an exercise and provides a team project, problem data, cleaned data for instructors, and reflections on past experiences. The main goal is to give instructors a prepared project in which their students perform realistic data preparation and subsequent analysis. The data for this project include categorical and continuous variables for subjects aged 65 and over, with measurements of calcium, inorganic phosphorus, and alkaline phosphatase levels in the blood. The project described in this article involves summary analysis, but the cleaned data could also be used for projects on independent-samples t-tests, analysis of variance, or regression.
Acknowledgements
The authors wish to thank the reviewers and the Datasets and Stories editor for their helpful comments.