Theory and Methods

Variable Selection Via Thompson Sampling

Pages 287-304 | Received 25 Jun 2020, Accepted 23 Mar 2021, Published online: 06 Jul 2021

References

  • Agrawal, S. and Goyal, N. (2012), “Analysis of Thompson Sampling for the Multi-Armed Bandit Problem,” in COLT 2012: 25th Annual Conference on Learning Theory in Edinburgh.
  • Barber, R. F., and Candès, E. J. (2015), “Controlling the False Discovery Rate Via Knockoffs,” The Annals of Statistics, 43, 2055–2085. DOI: 10.1214/15-AOS1337.
  • Barbieri, M., Berger, J. O., George, E. I., and Ročková, V. (2020), “The Median Probability Model and Correlated Variables,” Bayesian Analysis (to appear). DOI: 10.1214/20-BA1249.
  • Barbieri, M. M., and Berger, J. O. (2004), “Optimal Predictive Model Selection,” The Annals of Statistics, 32, 870–897.
  • Bhattacharya, A., Chakraborty, A., and Mallick, B. K. (2016), “Fast Sampling With Gaussian Scale Mixture Priors in High-Dimensional Regression,” Biometrika, 103, 985. DOI: 10.1093/biomet/asw042.
  • Bleich, J., Kapelner, A., George, E., and Jensen, S. (2014), “Variable Selection for BART: An Application to Gene Regulation,” The Annals of Applied Statistics, 8, 1750–1781.
  • Bottolo, L., and Richardson, S. (2010), “Evolutionary Stochastic Search for Bayesian Model Exploration,” Bayesian Analysis, 5, 583–618. DOI: 10.1214/10-BA523.
  • Breiman, L. (2001), “Random Forests,” Machine Learning, 45, 5–32. DOI: 10.1023/A:1010933404324.
  • Brown, P. J., Vannucci, M., and Fearn, T. (1998), “Multivariate Bayesian Variable Selection and Prediction,” Journal of the Royal Statistical Society, Series B, 60, 627–641. DOI: 10.1111/1467-9868.00144.
  • Bubeck, S., Munos, R., and Stoltz, G. (2009), “Pure Exploration in Multi-Armed Bandits Problems,” in International Conference on Algorithmic Learning Theory, Porto, Portugal.
  • Bubeck, S., Wang, T., and Viswanathan, N. (2013), “Multiple Identifications in Multi-Armed Bandits,” in International Conference on Machine Learning, Atlanta, Georgia.
  • Burns, C., Thomason, J., and Tansey, W. (2020), “Interpreting Black Box Models Via Hypothesis Testing,” in Proceedings of the 2020 ACM-IMS on Foundations of Data Science, 47–57.
  • Candes, E., Fan, Y., Janson, L., and Lv, J. (2018), “Panning for Gold: ‘Model-X’ Knockoffs for High Dimensional Controlled Variable Selection,” Journal of the Royal Statistical Society, Series B, 80, 551–577. DOI: 10.1111/rssb.12265.
  • Carbonetto, P., and Stephens, M. (2012), “Scalable Variational Inference for Bayesian Variable Selection in Regression, and Its Accuracy in Genetic Association Studies,” Bayesian Analysis, 7, 73–108. DOI: 10.1214/12-BA703.
  • Carvalho, C. M., Polson, N. G., and Scott, J. G. (2010), “The Horseshoe Estimator for Sparse Signals,” Biometrika, 97, 465–480. DOI: 10.1093/biomet/asq017.
  • Castillo, I., Schmidt-Hieber, J., and Van der Vaart, A. (2015), “Bayesian Linear Regression With Sparse Priors,” The Annals of Statistics, 43, 1986–2018. DOI: 10.1214/15-AOS1334.
  • Cesa-Bianchi, N., and Lugosi, G. (2012), “Combinatorial Bandits,” Journal of Computer and System Sciences, 78, 1404–1422. DOI: 10.1016/j.jcss.2012.01.001.
  • Chen, W., Wang, Y., and Yuan, Y. (2013), “Combinatorial Multi-Armed Bandit: General Framework and Applications,” in International Conference on Machine Learning.
  • Chipman, H., George, E. I., and McCulloch, R. E. (2001), “The Practical Implementation of Bayesian Model Selection,” in Institute of Mathematical Statistics Lecture Notes - Monograph Series (Vol. 38), Institute of Mathematical Statistics, pp. 65–116.
  • Chipman, H. A., George, E. I., and McCulloch, R. E. (2010), “BART: Bayesian Additive Regression Trees,” The Annals of Applied Statistics, 4, 266–298. DOI: 10.1214/09-AOAS285.
  • Combes, R., and Proutiere, A. (2014), “Unimodal Bandits: Regret Lower Bounds and Optimal Algorithms,” in International Conference on Machine Learning, Beijing, China.
  • Even-Dar, E., Mannor, S., and Mansour, Y. (2006), “Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems,” Journal of Machine Learning Research , 7, 1079–1105.
  • Fahy, C., and Yang, S. (2019), “Dynamic Feature Selection for Clustering High Dimensional Data Streams,” IEEE Access, 7.
  • Fan, J., and Lv, J. (2008), “Sure Independence Screening for Ultrahigh Dimensional Feature Space,” Journal of the Royal Statistical Society, Series B, 70, 849–911. DOI: 10.1111/j.1467-9868.2008.00674.x.
  • Fisher, A., Rudin, C., and Dominici, F. (2019), “All Models Are Wrong, But Many Are Useful: Learning a Variable’s Importance by Studying an Entire Class of Prediction Models Simultaneously,” Journal of Machine Learning Research, 20, 1–81.
  • Foster, D. P., and Stine, R. A. (2008), “α-Investing: A Procedure for Sequential Control of Expected False Discoveries,” Journal of the Royal Statistical Society, Series B, 70, 429–444.
  • Friedman, J., Hastie, T., and Tibshirani, R. (2001), The Elements of Statistical Learning (Vol. 1), Springer Series in Statistics, New York: Springer.
  • Friedman, J. H. (1991), “Multivariate Adaptive Regression Splines,” The Annals of Statistics, 19, 1–141. DOI: 10.1214/aos/1176347963.
  • Gai, Y., Krishnamachari, B., and Jain, R. (2012), “Combinatorial Network Optimization With Unknown Variables: Multi-Armed Bandits With Linear Rewards and Individual Observations,” IEEE/ACM Transactions on Networking, 20, 1466–1478. DOI: 10.1109/TNET.2011.2181864.
  • Garson, G. D. (1991), “A Comparison of Neural Network and Expert Systems Algorithms With Common Multivariate Procedures for Analysis of Social Science Data,” Social Science Computer Review, 9, 399–434. DOI: 10.1177/089443939100900304.
  • George, E. I., and McCulloch, R. E. (1993), “Variable Selection Via Gibbs Sampling,” Journal of the American Statistical Association, 88, 881–889. DOI: 10.1080/01621459.1993.10476353.
  • George, E. I., and McCulloch, R. E. (1997), “Approaches for Bayesian Variable Selection,” Statistica Sinica, 7, 339–373.
  • Gupta, S., Joshi, G., and Yagan, O. (2020), “Correlated Multi-Armed Bandits With a Latent Random Source,” in IEEE International Conference on Acoustics, Speech and Signal Processing, Barcelona, Spain.
  • Hill, J., Linero, A., and Murray, J. (2020), “Bayesian Additive Regression Trees: A Review and Look Forward,” Annual Review of Statistics and Its Application, 7, 251–278. DOI: 10.1146/annurev-statistics-031219-041110.
  • Hooker, G. (2007), “Generalized Functional ANOVA Diagnostics for High-Dimensional Functions of Dependent Variables,” Journal of Computational and Graphical Statistics, 16, 709–732. DOI: 10.1198/106186007X237892.
  • Horel, E., and Giesecke, K. (2019), “Towards Explainable AI: Significance Tests for Neural Networks,” arXiv:1902.06021.
  • Ishwaran, H. (2007), “Variable Importance in Binary Regression Trees and Forests,” Electronic Journal of Statistics , 1, 519–537. DOI: 10.1214/07-EJS039.
  • Javanmard, A., and Montanari, A. (2018), “Online Rules for Control of False Discovery Rate and False Discovery Exceedance,” The Annals of Statistics, 46, 526–554. DOI: 10.1214/17-AOS1559.
  • Johnson, V. E., and Rossell, D. (2012), “Bayesian Model Selection in High-Dimensional Settings,” Journal of the American Statistical Association, 107, 649–660. DOI: 10.1080/01621459.2012.682536.
  • Kazemitabar, J., Amini, A., Bloniarz, A., and Talwalkar, A. S. (2017), “Variable Importance Using Decision Trees,” in Advances in Neural Information Processing Systems, Long Beach, CA, USA.
  • Komiyama, J., Honda, J., and Nakagawa, H. (2015), “Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-Armed Bandit Problem With Multiple Plays,” in International Conference on Machine Learning.
  • Kveton, B., Wen, Z., Ashkan, A., Eydgahi, H., and Eriksson, B. (2014), “Matroid Bandits: Fast Combinatorial Optimization With Learning,” in Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, pp. 420–429.
  • Kveton, B., Wen, Z., Ashkan, A., and Szepesvari, C. (2015), “Combinatorial Cascading Bandits,” Advances in Neural Information Processing Systems, 28, 1450–1458.
  • Lafferty, J., and Wasserman, L. (2008), “RODEO: Sparse, Greedy Nonparametric Regression,” The Annals of Statistics, 36, 28–63. DOI: 10.1214/009053607000000811.
  • Lai, T., and Robbins, H. (1985), “Asymptotically Efficient Adaptive Allocation Rules,” Advances in Applied Mathematics, 6, 4–22. DOI: 10.1016/0196-8858(85)90002-8.
  • Lei, J., G’Sell, M., Rinaldo, A., Tibshirani, R. J., and Wasserman, L. (2018), “Distribution-Free Predictive Inference for Regression,” Journal of the American Statistical Association, 113, 1094–1111. DOI: 10.1080/01621459.2017.1307116.
  • Leike, J., Lattimore, T., Orseau, L., and Hutter, M. (2016), “Thompson Sampling is Asymptotically Optimal in General Environments,” in Conference on Uncertainty in Artificial Intelligence. Virginia, United States: AUAI Press.
  • Li, R., Zhong, W., and Zhu, L. (2012), “Feature Screening Via Distance Correlation Learning,” Journal of the American Statistical Association, 107, 1129–1139. DOI: 10.1080/01621459.2012.695654.
  • Liang, F., Li, Q., and Zhou, L. (2018), “Bayesian Neural Networks for Selection of Drug Sensitive Genes,” Journal of the American Statistical Association, 113, 955–972. DOI: 10.1080/01621459.2017.1409122.
  • Lin, Y., and Zhang, H. H. (2006), “Component Selection and Smoothing in Multivariate Nonparametric Regression,” The Annals of Statistics, 34, 2272–2297. DOI: 10.1214/009053606000000722.
  • Linero, A. R. (2018), “Bayesian Regression Trees for High-Dimensional Prediction and Variable Selection,” Journal of the American Statistical Association, 113, 626–636. DOI: 10.1080/01621459.2016.1264957.
  • Linero, A. R., and Yang, Y. (2018), “Bayesian Regression Tree Ensembles That Adapt to Smoothness and Sparsity,” Journal of the Royal Statistical Society, Series B, 80, 1087–1110. DOI: 10.1111/rssb.12293.
  • Liu, Y., Ročková, V., and Wang, Y. (2018), “ABC Variable Selection With Bayesian Forests,” arXiv:1806.02304.
  • Louppe, G., Wehenkel, L., Sutera, A., and Geurts, P. (2013), “Understanding Variable Importance in Forests of Randomized Trees,” Advances in Neural Information Processing Systems, 1, 431–439.
  • Lu, Y., Fan, Y., Lv, J., and Noble, W. S. (2018), “DeepPINK: Reproducible Feature Selection in Deep Neural Networks,” Proceedings of the 32nd International Conference in Neural Information Processing Systems, pp. 8690–8700.
  • Mase, M., Owen, A. B., and Seiler, B. (2019), “Explaining Black Box Decisions by Shapley Cohort Refinement,” arXiv:1911.00467.
  • Mitchell, T. J., and Beauchamp, J. J. (1988), “Bayesian Variable Selection in Linear Regression,” Journal of the American Statistical Association, 83, 1023–1032. DOI: 10.1080/01621459.1988.10478694.
  • Narisetty, N. N., and He, X. (2014), “Bayesian Variable Selection With Shrinking and Diffusing Priors,” The Annals of Statistics, 42, 789–817. DOI: 10.1214/14-AOS1207.
  • Ni, J., Neslin, S. A., and Sun, B. (2012), “Database Submission—The ISMS Durable Goods Data Sets,” Marketing Science, 31, 1008–1013. DOI: 10.1287/mksc.1120.0726.
  • Olden, J. D., and Jackson, D. A. (2002), “Illuminating the Black Box: A Randomization Approach for Understanding Variable Contributions in Artificial Neural Networks,” Ecological Modelling, 154, 135–150. DOI: 10.1016/S0304-3800(02)00064-9.
  • Owen, A. B., and Prieur, C. (2017), “On Shapley Value for Measuring Importance of Dependent Inputs,” SIAM/ASA Journal on Uncertainty Quantification, 5, 986–1002. DOI: 10.1137/16M1097717.
  • Pandey, S., Chakrabarti, D., and Agarwal, D. (2007), “Multi-Armed Bandit Problems With Dependent Arms,” in International Conference on Machine Learning, Corvallis, OR. DOI: 10.1145/1273496.1273587.
  • Patterson, E., and Sesia, M. (2018), knockoff: The Knockoff Filter for Controlled Variable Selection, Statistics Department, Stanford University. R package version 0.3.2.
  • Radchenko, P., and James, G. M. (2010), “Variable Selection Using Adaptive Nonlinear Interaction Structures in High Dimensions,” Journal of the American Statistical Association, 105, 1541–1553. DOI: 10.1198/jasa.2010.tm10130.
  • Ramdas, A., Yang, F., Wainwright, M. J., and Jordan, M. I. (2017), “Online Control of the False Discovery Rate With Decaying Memory,” in 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, eds. I. Guyon, U.V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan and R. Garnett.
  • Ravikumar, P., Lafferty, J., Liu, H., and Wasserman, L. (2009), “Sparse Additive Models,” Journal of the Royal Statistical Society, Series B, 71, 1009–1030. DOI: 10.1111/j.1467-9868.2009.00718.x.
  • Rhee, S.-Y., Fessel, W. J., Zolopa, A. R., Hurley, L., Liu, T., Taylor, J., Nguyen, D. P., Slome, S., Klein, D., Horberg, M., Flamm, J., Follansbee, S., Schapiro, J. M., and Shafer, R. W. (2005), “HIV-1 Protease and Reverse-Transcriptase Mutations: Correlations With Antiretroviral Therapy in Subtype B Isolates and Implications for Drug-Resistance Surveillance,” The Journal of Infectious Diseases, 192, 456–465. DOI: 10.1086/431601.
  • Rhee, S.-Y., Taylor, J., Wadhera, G., Ben-Hur, A., Brutlag, D. L., and Shafer, R. W. (2006), “Genotypic Predictors of Human Immunodeficiency Virus Type 1 Drug Resistance,” Proceedings of the National Academy of Sciences, 103, 17355–17360. DOI: 10.1073/pnas.0607274103.
  • Ročková, V., and George, E. I. (2014), “EMVS: The EM Approach to Bayesian Variable Selection,” Journal of the American Statistical Association, 109, 828–846. DOI: 10.1080/01621459.2013.869223.
  • Ročková, V., and George, E. I. (2018), “The Spike-and-Slab LASSO,” Journal of the American Statistical Association, 113, 431–444.
  • Rossell, D., and Telesca, D. (2017), “Nonlocal Priors for High-Dimensional Estimation,” Journal of the American Statistical Association, 112, 254–265. DOI: 10.1080/01621459.2015.1130634.
  • Russo, D. (2016), “Simple Bayesian Algorithms for Best Arm Identification,” in Conference on Learning Theory, New York, USA.
  • Scheipl, F. (2011), “spikeSlabGAM: Bayesian Variable Selection, Model Choice and Regularization for Generalized Additive Mixed Models in R,” arXiv:1105.5253.
  • Shapley, L. S. (1953), “A Value for n-Person Games,” Contributions to the Theory of Games , 2, 307–317.
  • Thompson, W. R. (1933), “On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples,” Biometrika, 25, 285–294. DOI: 10.1093/biomet/25.3-4.285.
  • Tibshirani, R. (2011), “Regression Shrinkage and Selection Via the LASSO: A Retrospective,” Journal of the Royal Statistical Society, Series B, 73, 273–282. DOI: 10.1111/j.1467-9868.2011.00771.x.
  • van der Pas, S., Scott, J., Chakraborty, A., and Bhattacharya, A. (2019), horseshoe: Implementation of the Horseshoe Prior (R package version 0.2.0).
  • van der Pas, S., Szabó, B., and van der Vaart, A. (2017), “Uncertainty Quantification for the Horseshoe” (with discussion), Bayesian Analysis, 12, 1221–1274. DOI: 10.1214/17-BA1065.
  • Vannucci, M., and Stingo, F. C. (2010), “Bayesian Models for Variable Selection That Incorporate Biological Information,” Bayesian Statistics , 9, 1–20.
  • Wang, J., Shen, J., and Li, P. (2018), “Provable Variable Selection for Streaming Features,” in International Conference on Machine Learning, Stockholm, Sweden.
  • Wang, S., and Chen, W. (2018), “Thompson Sampling for Combinatorial Semi-Bandits,” in International Conference on Machine Learning, pp. 5114–5122.
  • Ye, M., and Sun, Y. (2018), “Variable Selection Via Penalized Neural Network: A Drop-Out-One Loss Approach,” in International Conference on Machine Learning, Stockholm, Sweden.
  • Zhang, T., Ge, S. S., and Hang, C. C. (2000), “Adaptive Neural Network Control for Strict-Feedback Nonlinear Systems Using Backstepping Design,” Automatica, 36, 1835–1846. DOI: 10.1016/S0005-1098(00)00116-3.
  • Zhou, J., Foster, D. P., Stine, R. A., and Ungar, L. H. (2006), “Streamwise Feature Selection,” Journal of Machine Learning Research , 7, 1861–1885.
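Several entries above (Thompson 1933; Agrawal and Goyal 2012; Russo 2016) concern the classical Thompson sampling algorithm for the multi-armed bandit, which gives the article its title. For readers unfamiliar with it, a minimal sketch of the Beta-Bernoulli variant those works analyze follows; the function name, toy arm probabilities, and seed handling are illustrative choices, not taken from the article.

```python
import random

def thompson_sampling(arms, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling (after Thompson, 1933).

    `arms` lists the true (unknown to the learner) Bernoulli success
    probabilities. Each arm keeps a Beta(successes + 1, failures + 1)
    posterior; each round we draw one sample per posterior and pull
    the arm with the largest sample, so exploration is driven by
    posterior uncertainty rather than an explicit schedule.
    """
    rng = random.Random(seed)
    successes = [0] * len(arms)
    failures = [0] * len(arms)
    total_reward = 0
    for _ in range(n_rounds):
        # One posterior draw per arm; play the best-looking arm.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(len(arms))]
        i = max(range(len(arms)), key=samples.__getitem__)
        # Observe a Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < arms[i] else 0
        successes[i] += reward
        failures[i] += 1 - reward
        total_reward += reward
    return total_reward, successes, failures
```

Over many rounds the posterior of a suboptimal arm concentrates below that of the best arm, so the best arm accumulates almost all of the pulls; the cited regret analyses make this rate precise.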
