Abstract
This article presents a new ensemble learning method for classification problems called projection pursuit random forest (PPF). PPF uses the PPtree algorithm, in which trees are constructed by splitting on linear combinations of randomly chosen variables. Projection pursuit is used to choose the projection of the variables that best separates the classes. Because a linear combination of variables takes the correlation between variables into account, PPF can outperform a traditional random forest when the separation between groups occurs in combinations of variables. The method can be used in multi-class problems and is implemented in the R package PPforest, which is available on CRAN. Supplementary files for this article are available online.
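The idea of splitting on a linear combination can be illustrated with a minimal sketch of a single node split. This is not the PPforest implementation: the Fisher linear discriminant direction stands in as one illustrative choice of projection, the example is limited to two classes and two variables per split, and the names `draw_variables`, `lda_direction_2d`, `fit_split`, and `predict` are hypothetical. In the toy data the classes differ only in the difference between the first two variables, so neither variable separates them on its own.

```python
import random

def draw_variables(p, k, rng):
    """Draw k of p variable indices at random, as a forest would at each node."""
    return sorted(rng.sample(range(p), k))

def lda_direction_2d(X0, X1):
    """Fisher direction w = Sw^{-1} (m1 - m0) for two classes in two dimensions."""
    def mean(X):
        n = len(X)
        return [sum(r[0] for r in X) / n, sum(r[1] for r in X) / n]

    def scatter(X, m):
        sxx = sum((r[0] - m[0]) ** 2 for r in X)
        sxy = sum((r[0] - m[0]) * (r[1] - m[1]) for r in X)
        syy = sum((r[1] - m[1]) ** 2 for r in X)
        return sxx, sxy, syy

    m0, m1 = mean(X0), mean(X1)
    # Pooled within-class scatter matrix [[a, b], [b, c]]
    a, b, c = (s0 + s1 for s0, s1 in zip(scatter(X0, m0), scatter(X1, m1)))
    det = a * c - b * b
    d0, d1 = m1[0] - m0[0], m1[1] - m0[1]
    # Closed-form 2x2 inverse applied to the mean difference
    w = [(c * d0 - b * d1) / det, (a * d1 - b * d0) / det]
    return w, m0, m1

def fit_split(X, y, pair):
    """Fit a one-dimensional projection and a cut point on the chosen pair of variables."""
    i, j = pair
    X0 = [(r[i], r[j]) for r, c in zip(X, y) if c == 0]
    X1 = [(r[i], r[j]) for r, c in zip(X, y) if c == 1]
    w, m0, m1 = lda_direction_2d(X0, X1)
    proj = lambda p: w[0] * p[0] + w[1] * p[1]
    cut = (proj(m0) + proj(m1)) / 2  # midpoint of the projected class means
    return w, cut

def predict(row, pair, w, cut):
    """Assign class 1 when the projected value exceeds the cut point."""
    i, j = pair
    return int(w[0] * row[i] + w[1] * row[j] > cut)

# Toy data (assumed for illustration): x0 and x1 are strongly correlated,
# and the classes differ only in x1 - x0; x2 is pure noise.
rng = random.Random(1)
X, y = [], []
for label, shift in ((0, 1.0), (1, -1.0)):
    for _ in range(100):
        t = rng.uniform(-3, 3)
        X.append([t, t + shift + rng.uniform(-0.3, 0.3), rng.uniform(-1, 1)])
        y.append(label)

# Fixed here for reproducibility; a forest would use draw_variables(3, 2, rng).
pair = (0, 1)
w, cut = fit_split(X, y, pair)
acc = sum(predict(r, pair, w, cut) == c for r, c in zip(X, y)) / len(y)
print("projection:", w, "cut:", cut, "training accuracy:", acc)
```

The fitted coefficients have opposite signs, so the split is made on a contrast between the two correlated variables, which is the kind of oblique boundary that an axis-aligned split on either variable alone cannot reproduce.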
Supplementary Materials
This article was written with the R packages knitr (Xie 2015), ggplot2 (Wickham 2009), and dplyr (Wickham et al. 2020), and the files to reproduce the article and results are available at https://github.com/natydasilva/PPforestpaper.
Acknowledgments
The authors are grateful for the helpful reviews provided by the editor, associate editor, and two anonymous reviewers.