Research Article

Hybrid Parameter Search and Dynamic Model Selection for Mixed-Variable Bayesian Optimization

Received 30 Oct 2022, Accepted 15 Jan 2024, Published online: 08 Mar 2024
