Abstract
At first blush, it appears that statistics and data quality should be perfect together. After all, statistical practitioners depend on high-quality data to conduct their analyses, and many data quality efforts seem well-suited to the application of statistical methods. Yet, the facts on the ground suggest otherwise: statistical applications, quality improvement projects, and data science initiatives are all plagued by bad data (Kenett and Redman 2019). Worse still, these respective communities have too often viewed data quality as uninteresting “grunt work,” and shown little interest in systematic improvement. But why? This article explores this quandary in detail, diagnoses the root causes, and shows that resolving these causes presents an enormous opportunity.
Additional information
Notes on contributors
Thomas Redman
Thomas C. Redman, “the Data Doc,” is President of Data Quality Solutions, based in Rumson, NJ. He helps companies and people, including startups, multinationals, executives, and leaders at all levels, chart their courses to data-driven futures. He places special emphasis on quality, analytics, and organizational capabilities. He has published extensively on improving data quality in the business literature, including the Harvard Business Review and the Sloan Management Review, and is the author of several books, most recently The Real Work of Data Science, coauthored with Ron Kenett, published in 2019.
Roger Hoerl
Dr. Roger W. Hoerl is Brate-Peschel Associate Professor of Statistics at Union College, in Schenectady, NY. Previously, he led the Applied Statistics Lab at GE Global Research. Dr. Hoerl has been named a Fellow of the American Statistical Association and the American Society for Quality, and has been elected to the International Statistical Institute and International Academy for Quality. He has received the Brumbaugh and Hunter Awards, as well as the Shewhart Medal, from the American Society for Quality, and the Founders Award and Deming Lectureship Award from the American Statistical Association. His introductory text Statistical Thinking: Improving Business Performance, coauthored with Ronald Snee and now in its 3rd edition, was described as “…probably the most practical basic statistics textbook that has ever been written within a business context” by the journal Technometrics.