ABSTRACT
This tutorial provides a gentle introduction to kernel density estimation (KDE) and recent advances regarding confidence bands and geometric/topological features. We begin with a discussion of basic properties of KDE: the convergence rate under various metrics, density derivative estimation, and bandwidth selection. Then, we introduce common approaches to constructing confidence intervals/bands, and we discuss how to handle bias. Next, we discuss recent advances in the inference of geometric and topological features of a density function using KDE. Finally, we illustrate how one can use KDE to estimate a cumulative distribution function and a receiver operating characteristic curve. We provide R implementations related to this tutorial at the end.
Acknowledgments
We thank the two referees for their very useful suggestions. We also thank Gang Chen, Aurelio Uncini, and Larry Wasserman for useful comments.
Disclosure statement
No potential conflict of interest was reported by the author.
Notes
4. R source code: https://github.com/yenchic/HDLV
5. R source code: https://github.com/yenchic/Morse_Smale
Additional information
Notes on contributors
Yen-Chi Chen
Yen-Chi Chen is an assistant professor in the Department of Statistics, a data science fellow in the eScience Institute, and a statistician in the National Alzheimer's Coordinating Center at the University of Washington. His research focuses on nonparametric statistics, cluster analysis, topological data analysis, and applications in various fields.