Bayesian Regression Trees for High-Dimensional Prediction and Variable Selection: Journal of the American Statistical Association: Vol 113, No 522

ABSTRACT

Decision tree ensembles are an extremely popular tool for obtaining high-quality predictions in nonparametric regression problems. Unmodified, however, many commonly used decision tree ensemble methods do not adapt to sparsity in the regime in which the number of predictors is larger than the number of observations. A recent stream of research concerns the construction of decision tree ensembles that are motivated by a generative probabilistic model, the most influential method being the Bayesian additive regression trees (BART) framework. In this article, we take a Bayesian point of view on this problem and show how to construct priors on decision tree ensembles that are capable of adapting to sparsity in the predictors by placing a sparsity-inducing Dirichlet hyperprior on the splitting proportions of the regression tree prior. We characterize the asymptotic distribution of the number of predictors included in the model and show how this prior can be easily incorporated into existing Markov chain Monte Carlo schemes. We demonstrate that our approach yields useful posterior inclusion probabilities for each predictor and illustrate the usefulness of our approach relative to other decision tree ensemble approaches on both simulated and real datasets. Supplementary materials for this article are available online.
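As a rough illustration of the idea of a sparsity-inducing Dirichlet hyperprior on splitting proportions, the following minimal sketch draws a vector of splitting proportions and then redraws it after observing how often each predictor was used in splitting rules. It is not the article's implementation. The symmetric concentration `alpha / P`, the conjugate update that adds per-predictor split counts to the concentration, and all function names are assumptions made for illustration only.

```python
import random


def draw_dirichlet(concentration, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in concentration]
    total = sum(gammas)
    if total == 0.0:
        # Every Gamma draw underflowed (possible for tiny concentrations).
        # Approximate by placing all mass on one coordinate chosen in
        # proportion to the concentration parameters.
        index = rng.choices(range(len(concentration)), weights=concentration)[0]
        return [1.0 if i == index else 0.0 for i in range(len(concentration))]
    return [g / total for g in gammas]


def sample_prior_proportions(num_predictors, alpha, rng):
    """Assumed prior: s ~ Dirichlet(alpha/P, ..., alpha/P).

    Small alpha/P concentrates most of the mass on a few predictors.
    """
    concentration = [alpha / num_predictors] * num_predictors
    return draw_dirichlet(concentration, rng)


def sample_posterior_proportions(split_counts, alpha, rng):
    """Assumed conjugate update: s | counts ~ Dirichlet(alpha/P + counts).

    split_counts[j] is the (hypothetical) number of splitting rules in the
    current ensemble that use predictor j.
    """
    num_predictors = len(split_counts)
    concentration = [alpha / num_predictors + m for m in split_counts]
    return draw_dirichlet(concentration, rng)


if __name__ == "__main__":
    rng = random.Random(0)

    # Prior draw with many predictors: most proportions are close to zero.
    prior = sample_prior_proportions(num_predictors=1000, alpha=1.0, rng=rng)
    print("predictors with prior mass > 1%:", sum(p > 0.01 for p in prior))

    # Redraw after (hypothetical) counts showing two predictors in heavy use.
    counts = [50, 30] + [0] * 98
    posterior = sample_posterior_proportions(counts, alpha=1.0, rng=rng)
    print("mass on the two used predictors:", posterior[0] + posterior[1])
```

In a sampler of this kind, a step like `sample_posterior_proportions` would be one extra Gibbs-style update per iteration, which is one way to read the abstract's remark that the prior fits easily into existing Markov chain Monte Carlo schemes.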

Supplementary Material

Supplementary material includes all proofs, as well as the results of additional simulations and computational details. A developmental version of a modification of the bartMachine package of Kapelner and Bleich (2016), used to implement the methodology, is also included.

Acknowledgements

The author thanks Fred Huffer for helpful discussions, as well as two anonymous reviewers whose comments helped improve this article.

Additional information

Funding

This research was supported by the Office of the Secretary of Defense, Directorate of Operational Test and Evaluation under the Science of Test research program, #SOT-FSU-FATs-06.
