Theory and Methods

Variable Selection Via Thompson Sampling

Pages 287-304 | Received 25 Jun 2020, Accepted 23 Mar 2021, Published online: 06 Jul 2021

References

  • Agrawal, S. and Goyal, N. (2012), “Analysis of Thompson Sampling for the Multi-Armed Bandit Problem,” in COLT 2012: 25th Annual Conference on Learning Theory in Edinburgh.
  • Barber, R. F., and Candès, E. J. (2015), “Controlling the False Discovery Rate Via Knockoffs,” The Annals of Statistics, 43, 2055–2085. DOI: 10.1214/15-AOS1337.
  • Barbieri, M., Berger, J. O., George, E. I., and Ročková, V. (2020), “The Median Probability Model and Correlated Variables,” Bayesian Analysis (to appear). DOI: 10.1214/20-BA1249.
  • Barbieri, M. M., and Berger, J. O. (2004), “Optimal Predictive Model Selection,” The Annals of Statistics, 32, 870–897.
  • Bhattacharya, A., Chakraborty, A., and Mallick, B. K. (2016), “Fast Sampling With Gaussian Scale Mixture Priors in High-Dimensional Regression,” Biometrika, 103, 985. DOI: 10.1093/biomet/asw042.
  • Bleich, J., Kapelner, A., George, E., and Jensen, S. (2014), “Variable Selection for BART: An Application to Gene Regulation,” The Annals of Applied Statistics, 8, 1750–1781.
  • Bottolo, L., and Richardson, S. (2010), “Evolutionary Stochastic Search for Bayesian Model Exploration,” Bayesian Analysis, 5, 583–618. DOI: 10.1214/10-BA523.
  • Breiman, L. (2001), “Random Forests,” Machine Learning, 45, 5–32. DOI: 10.1023/A:1010933404324.
  • Brown, P. J., Vannucci, M., and Fearn, T. (1998), “Multivariate Bayesian Variable Selection and Prediction,” Journal of the Royal Statistical Society, Series B, 60, 627–641. DOI: 10.1111/1467-9868.00144.
  • Bubeck, S., Munos, R., and Stoltz, G. (2009), “Pure Exploration in Multi-Armed Bandits Problems,” in International Conference on Algorithmic Learning Theory, Porto, Portugal.
  • Bubeck, S., Wang, T., and Viswanathan, N. (2013), “Multiple Identifications in Multi-Armed Bandits,” in International Conference on Machine Learning, Atlanta, Georgia.
  • Burns, C., Thomason, J., and Tansey, W. (2020), “Interpreting Black Box Models Via Hypothesis Testing,” in Proceedings of the 2020 ACM-IMS on Foundations of Data Science, 47–57.
  • Candes, E., Fan, Y., Janson, L., and Lv, J. (2018), “Panning for Gold: ‘Model-X’ Knockoffs for High Dimensional Controlled Variable Selection,” Journal of the Royal Statistical Society, Series B, 80, 551–577. DOI: 10.1111/rssb.12265.
  • Carbonetto, P., and Stephens, M. (2012), “Scalable Variational Inference for Bayesian Variable Selection in Regression, and Its Accuracy in Genetic Association Studies,” Bayesian Analysis, 7, 73–108. DOI: 10.1214/12-BA703.
  • Carvalho, C. M., Polson, N. G., and Scott, J. G. (2010), “The Horseshoe Estimator for Sparse Signals,” Biometrika, 97, 465–480. DOI: 10.1093/biomet/asq017.
  • Castillo, I., Schmidt-Hieber, J., and Van der Vaart, A. (2015), “Bayesian Linear Regression With Sparse Priors,” The Annals of Statistics, 43, 1986–2018. DOI: 10.1214/15-AOS1334.
  • Cesa-Bianchi, N., and Lugosi, G. (2012), “Combinatorial Bandits,” Journal of Computer and System Sciences, 78, 1404–1422. DOI: 10.1016/j.jcss.2012.01.001.
  • Chen, W., Wang, Y., and Yuan, Y. (2013), “Combinatorial Multi-Armed Bandit: General Framework and Applications,” in International Conference on Machine Learning.
  • Chipman, H., George, E. I., and McCulloch, R. E. (2001), “The Practical Implementation of Bayesian Model Selection,” in Institute of Mathematical Statistics Lecture Notes - Monograph Series (Vol. 38), Institute of Mathematical Statistics, pp. 65–116.
  • Chipman, H. A., George, E. I., and McCulloch, R. E. (2010), “BART: Bayesian Additive Regression Trees,” The Annals of Applied Statistics, 4, 266–298. DOI: 10.1214/09-AOAS285.
  • Combes, R., and Proutiere, A. (2014), “Unimodal Bandits: Regret Lower Bounds and Optimal Algorithms,” in International Conference on Machine Learning, Beijing, China.
  • Even-Dar, E., Mannor, S., and Mansour, Y. (2006), “Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems,” Journal of Machine Learning Research , 7, 1079–1105.
  • Fahy, C., and Yang, S. (2019), “Dynamic Feature Selection for Clustering High Dimensional Data Streams,” IEEE Access, 7.
  • Fan, J., and Lv, J. (2008), “Sure Independence Screening for Ultrahigh Dimensional Feature Space,” Journal of the Royal Statistical Society, Series B, 70, 849–911. DOI: 10.1111/j.1467-9868.2008.00674.x.
  • Fisher, A., Rudin, C., and Dominici, F. (2019), “All Models Are Wrong, But Many Are Useful: Learning a Variable’s Importance by Studying an Entire Class of Prediction Models Simultaneously,” Journal of Machine Learning Research, 20, 1–81.
  • Foster, D. P., and Stine, R. A. (2008), “α-Investing: A Procedure for Sequential Control of Expected False Discoveries,” Journal of the Royal Statistical Society, Series B, 70, 429–444.
  • Friedman, J., Hastie, T., and Tibshirani, R. (2001), The Elements of Statistical Learning (Vol. 1), Springer Series in Statistics, New York: Springer.
  • Friedman, J. H. (1991), “Multivariate Adaptive Regression Splines,” The Annals of Statistics, 19, 1–141. DOI: 10.1214/aos/1176347963.
  • Gai, Y., Krishnamachari, B., and Jain, R. (2012), “Combinatorial Network Optimization With Unknown Variables: Multi-Armed Bandits With Linear Rewards and Individual Observations,” IEEE/ACM Transactions on Networking, 20, 1466–1478. DOI: 10.1109/TNET.2011.2181864.
  • Garson, G. D. (1991), “A Comparison of Neural Network and Expert Systems Algorithms With Common Multivariate Procedures for Analysis of Social Science Data,” Social Science Computer Review, 9, 399–434. DOI: 10.1177/089443939100900304.
  • George, E. I., and McCulloch, R. E. (1993), “Variable Selection Via Gibbs Sampling,” Journal of the American Statistical Association, 88, 881–889. DOI: 10.1080/01621459.1993.10476353.
  • George, E. I., and McCulloch, R. E. (1997), “Approaches for Bayesian Variable Selection,” Statistica Sinica, 7, 339–373.
  • Gupta, S., Joshi, G., and Yagan, O. (2020), “Correlated Multi-Armed Bandits With a Latent Random Source,” in IEEE International Conference on Acoustics, Speech and Signal Processing, Barcelona, Spain.
  • Hill, J., Linero, A., and Murray, J. (2020), “Bayesian Additive Regression Trees: A Review and Look Forward,” Annual Review of Statistics and Its Application, 7, 251–278. DOI: 10.1146/annurev-statistics-031219-041110.
  • Hooker, G. (2007), “Generalized Functional ANOVA Diagnostics for High-Dimensional Functions of Dependent Variables,” Journal of Computational and Graphical Statistics, 16, 709–732. DOI: 10.1198/106186007X237892.
  • Horel, E., and Giesecke, K. (2019), “Towards Explainable AI: Significance Tests for Neural Networks,” arXiv:1902.06021.
  • Ishwaran, H. (2007), “Variable Importance in Binary Regression Trees and Forests,” Electronic Journal of Statistics , 1, 519–537. DOI: 10.1214/07-EJS039.
  • Javanmard, A., and Montanari, A. (2018), “Online Rules for Control of False Discovery Rate and False Discovery Exceedance,” The Annals of Statistics, 46, 526–554. DOI: 10.1214/17-AOS1559.
  • Johnson, V. E., and Rossell, D. (2012), “Bayesian Model Selection in High-Dimensional Settings,” Journal of the American Statistical Association, 107, 649–660. DOI: 10.1080/01621459.2012.682536.
  • Kazemitabar, J., Amini, A., Bloniarz, A., and Talwalkar, A. S. (2017), “Variable Importance Using Decision Trees,” in Advances in Neural Information Processing Systems, Long Beach, CA, USA.
  • Komiyama, J., Honda, J., and Nakagawa, H. (2015), “Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-Armed Bandit Problem With Multiple Plays,” in International Conference on Machine Learning.
  • Kveton, B., Wen, Z., Ashkan, A., Eydgahi, H., and Eriksson, B. (2014), “Matroid Bandits: Fast Combinatorial Optimization With Learning,” in Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, pp. 420–429.
  • Kveton, B., Wen, Z., Ashkan, A., and Szepesvari, C. (2015), “Combinatorial Cascading Bandits,” Advances in Neural Information Processing Systems, 28, 1450–1458.
  • Lafferty, J., and Wasserman, L. (2008), “RODEO: Sparse, Greedy Nonparametric Regression,” The Annals of Statistics, 36, 28–63. DOI: 10.1214/009053607000000811.
  • Lai, T., and Robbins, H. (1985), “Asymptotically Efficient Adaptive Allocation Rules,” Advances in Applied Mathematics, 6, 4–22. DOI: 10.1016/0196-8858(85)90002-8.
  • Lei, J., G’Sell, M., Rinaldo, A., Tibshirani, R. J., and Wasserman, L. (2018), “Distribution-Free Predictive Inference for Regression,” Journal of the American Statistical Association, 113, 1094–1111. DOI: 10.1080/01621459.2017.1307116.
  • Leike, J., Lattimore, T., Orseau, L., and Hutter, M. (2016), “Thompson Sampling is Asymptotically Optimal in General Environments,” in Conference on Uncertainty in Artificial Intelligence. Virginia, United States: AUAI Press.
  • Li, R., Zhong, W., and Zhu, L. (2012), “Feature Screening Via Distance Correlation Learning,” Journal of the American Statistical Association, 107, 1129–1139. DOI: 10.1080/01621459.2012.695654.
  • Liang, F., Li, Q., and Zhou, L. (2018), “Bayesian Neural Networks for Selection of Drug Sensitive Genes,” Journal of the American Statistical Association, 113, 955–972. DOI: 10.1080/01621459.2017.1409122.
  • Lin, Y., and Zhang, H. H. (2006), “Component Selection and Smoothing in Multivariate Nonparametric Regression,” The Annals of Statistics, 34, 2272–2297. DOI: 10.1214/009053606000000722.
  • Linero, A. R. (2018), “Bayesian Regression Trees for High-Dimensional Prediction and Variable Selection,” Journal of the American Statistical Association, 113, 626–636. DOI: 10.1080/01621459.2016.1264957.
  • Linero, A. R., and Yang, Y. (2018), “Bayesian Regression Tree Ensembles That Adapt to Smoothness and Sparsity,” Journal of the Royal Statistical Society, Series B, 80, 1087–1110. DOI: 10.1111/rssb.12293.
  • Liu, Y., Ročková, V., and Wang, Y. (2018), “ABC Variable Selection With Bayesian Forests,” arXiv:1806.02304.
  • Louppe, G., Wehenkel, L., Sutera, A., and Geurts, P. (2013), “Understanding Variable Importance in Forests of Randomized Trees,” Advances in Neural Information Processing Systems, 1, 431–439.
  • Lu, Y., Fan, Y., Lv, J., and Noble, W. S. (2018), “DeepPINK: Reproducible Feature Selection in Deep Neural Networks,” Proceedings of the 32nd International Conference in Neural Information Processing Systems, pp. 8690–8700.
  • Mase, M., Owen, A. B., and Seiler, B. (2019), “Explaining Black Box Decisions by Shapley Cohort Refinement,” arXiv:1911.00467.
  • Mitchell, T. J., and Beauchamp, J. J. (1988), “Bayesian Variable Selection in Linear Regression,” Journal of the American Statistical Association, 83, 1023–1032. DOI: 10.1080/01621459.1988.10478694.
  • Narisetty, N. N., and He, X. (2014), “Bayesian Variable Selection With Shrinking and Diffusing Priors,” The Annals of Statistics, 42, 789–817. DOI: 10.1214/14-AOS1207.
  • Ni, J., Neslin, S. A., and Sun, B. (2012), “Database Submission—The ISMS Durable Goods Data Sets,” Marketing Science, 31, 1008–1013. DOI: 10.1287/mksc.1120.0726.
  • Olden, J. D., and Jackson, D. A. (2002), “Illuminating the Black Box: A Randomization Approach for Understanding Variable Contributions in Artificial Neural Networks,” Ecological Modelling, 154, 135–150. DOI: 10.1016/S0304-3800(02)00064-9.
  • Owen, A. B., and Prieur, C. (2017), “On Shapley Value for Measuring Importance of Dependent Inputs,” SIAM/ASA Journal on Uncertainty Quantification, 5, 986–1002. DOI: 10.1137/16M1097717.
  • Pandey, S., Chakrabarti, D., and Agarwal, D. (2007), “Multi-Armed Bandit Problems With Dependent Arms,” in International Conference on Machine Learning, Corvallis, OR. DOI: 10.1145/1273496.1273587.
  • Patterson, E., and Sesia, M. (2018), knockoff: The Knockoff Filter for Controlled Variable Selection, Statistics Department, Stanford University. R package version 0.3.2.
  • Radchenko, P., and James, G. M. (2010), “Variable Selection Using Adaptive Nonlinear Interaction Structures in High Dimensions,” Journal of the American Statistical Association, 105, 1541–1553. DOI: 10.1198/jasa.2010.tm10130.
  • Ramdas, A., Yang, F., Wainwright, M. J., and Jordan, M. I. (2017), “Online Control of the False Discovery Rate With Decaying Memory,” in 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, eds. I. Guyon, U.V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan and R. Garnett.
  • Ravikumar, P., Lafferty, J., Liu, H., and Wasserman, L. (2009), “Sparse Additive Models,” Journal of the Royal Statistical Society, Series B, 71, 1009–1030. DOI: 10.1111/j.1467-9868.2009.00718.x.
  • Rhee, S.-Y., Fessel, W. J., Zolopa, A. R., Hurley, L., Liu, T., Taylor, J., Nguyen, D. P., Slome, S., Klein, D., Horberg, M., Flamm, J., Follansbee, S., Schapiro, J. M., and Shafer, R. W. (2005), “HIV-1 Protease and Reverse-Transcriptase Mutations: Correlations With Antiretroviral Therapy in Subtype B Isolates and Implications for Drug-Resistance Surveillance,” The Journal of Infectious Diseases, 192, 456–465. DOI: 10.1086/431601.
  • Rhee, S.-Y., Taylor, J., Wadhera, G., Ben-Hur, A., Brutlag, D. L., and Shafer, R. W. (2006), “Genotypic Predictors of Human Immunodeficiency Virus Type 1 Drug Resistance,” Proceedings of the National Academy of Sciences, 103, 17355–17360. DOI: 10.1073/pnas.0607274103.
  • Ročková, V., and George, E. I. (2014), “EMVS: The EM Approach to Bayesian Variable Selection,” Journal of the American Statistical Association, 109, 828–846. DOI: 10.1080/01621459.2013.869223.
  • Ročková, V., and George, E. I. (2018), “The Spike-and-Slab LASSO,” Journal of the American Statistical Association, 113, 431–444.
  • Rossell, D., and Telesca, D. (2017), “Nonlocal Priors for High-Dimensional Estimation,” Journal of the American Statistical Association, 112, 254–265. DOI: 10.1080/01621459.2015.1130634.
  • Russo, D. (2016), “Simple Bayesian Algorithms for Best Arm Identification,” in Conference on Learning Theory, New York, USA.
  • Scheipl, F. (2011), “spikeSlabGAM: Bayesian Variable Selection, Model Choice and Regularization for Generalized Additive Mixed Models in R,” arXiv:1105.5253.
  • Shapley, L. S. (1953), “A Value for n-Person Games,” Contributions to the Theory of Games , 2, 307–317.
  • Thompson, W. R. (1933), “On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples,” Biometrika, 25, 285–294. DOI: 10.1093/biomet/25.3-4.285.
  • Tibshirani, R. (2011), “Regression Shrinkage and Selection Via the LASSO: A Retrospective,” Journal of the Royal Statistical Society, Series B, 73, 273–282. DOI: 10.1111/j.1467-9868.2011.00771.x.
  • van der Pas, S., Scott, J., Chakraborty, A., and Bhattacharya, A. (2019), horseshoe: Implementation of the Horseshoe Prior (R package version 0.2.0).
  • van der Pas, S., Szabó, B., and van der Vaart, A. (2017), “Uncertainty Quantification for the Horseshoe” (with discussion), Bayesian Analysis, 12, 1221–1274. DOI: 10.1214/17-BA1065.
  • Vannucci, M., and Stingo, F. C. (2010), “Bayesian Models for Variable Selection That Incorporate Biological Information,” Bayesian Statistics , 9, 1–20.
  • Wang, J., Shen, J., and Li, P. (2018), “Provable Variable Selection for Streaming Features,” in International Conference on Machine Learning, Stockholm, Sweden.
  • Wang, S., and Chen, W. (2018), “Thompson Sampling for Combinatorial Semi-Bandits,” in International Conference on Machine Learning, pp. 5114–5122.
  • Ye, M., and Sun, Y. (2018), “Variable Selection Via Penalized Neural Network: A Drop-Out-One Loss Approach,” in International Conference on Machine Learning, Stockholm, Sweden.
  • Zhang, T., Ge, S. S., and Hang, C. C. (2000), “Adaptive Neural Network Control for Strict-Feedback Nonlinear Systems Using Backstepping Design,” Automatica, 36, 1835–1846. DOI: 10.1016/S0005-1098(00)00116-3.
  • Zhou, J., Foster, D. P., Stine, R. A., and Ungar, L. H. (2006), “Streamwise Feature Selection,” Journal of Machine Learning Research , 7, 1861–1885.
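Several entries above (Thompson 1933; Agrawal and Goyal 2012; Russo 2016) concern the classical Thompson sampling algorithm for the multi-armed bandit, which gives the article its title. For readers unfamiliar with it, a minimal sketch of the Beta-Bernoulli variant those works analyze follows; the function name, toy arm probabilities, and seed handling are illustrative choices, not taken from the article.

```python
import random

def thompson_sampling(arms, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling (after Thompson, 1933).

    `arms` lists the true (unknown to the learner) Bernoulli success
    probabilities. Each arm keeps a Beta(successes + 1, failures + 1)
    posterior; each round we draw one sample per posterior and pull
    the arm with the largest sample, so exploration is driven by
    posterior uncertainty rather than an explicit schedule.
    """
    rng = random.Random(seed)
    successes = [0] * len(arms)
    failures = [0] * len(arms)
    total_reward = 0
    for _ in range(n_rounds):
        # One posterior draw per arm; play the best-looking arm.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(len(arms))]
        i = max(range(len(arms)), key=samples.__getitem__)
        # Observe a Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < arms[i] else 0
        successes[i] += reward
        failures[i] += 1 - reward
        total_reward += reward
    return total_reward, successes, failures
```

Over many rounds the posterior of a suboptimal arm concentrates below that of the best arm, so the best arm accumulates almost all of the pulls; the cited regret analyses make this rate precise.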
