
Optimize to generalize in Gaussian processes: An alternative objective based on the Rényi divergence

Pages 600-610 | Received 31 Mar 2023, Accepted 20 Apr 2023, Published online: 13 Jul 2023

References

  • Alshraideh, H. and Khatatbeh, E. (2014) A Gaussian process control chart for monitoring autocorrelated process data. Journal of Quality Technology, 46(4), 317–322.
  • Alvarez, M. and Lawrence, N. (2008) Sparse convolved Gaussian processes for multi-output regression. Advances in Neural Information Processing Systems, 21, 57–64.
  • Asuncion, A. and Newman, D.J. (2007) UCI machine learning repository, University of California, School of Information and Computer Science, Irvine, CA. Available at http://www.ics.uci.edu/~mlearn/MLRepository.html
  • Bhattacharya, A., Pati, D. and Yang, Y. (2019) Bayesian fractional posteriors. Annals of Statistics, 47(1), 39–66.
  • Bishop, C.M. (2006) Pattern Recognition and Machine Learning, Springer, New York, NY.
  • Blei, D.M., Kucukelbir, A. and McAuliffe, J.D. (2017) Variational inference: A review for statisticians. Journal of the American Statistical Association, 112(518), 859–877.
  • Bui, T., Hernández-Lobato, D., Hernández-Lobato, J.M., Li, Y. and Turner, R. (2016) Deep Gaussian processes for regression using approximate expectation propagation, in International Conference on Machine Learning, New York, NY, pp. 1472–1481.
  • Bui, T.D., Yan, J. and Turner, R.E. (2017) A unifying framework for Gaussian process pseudo-point approximations using power expectation propagation. The Journal of Machine Learning Research, 18(1), 3649–3720.
  • Burt, D.R., Rasmussen, C.E. and van der Wilk, M. (2019) Rates of convergence for sparse variational Gaussian process regression. arXiv preprint arXiv:1903.03571.
  • Chen, H., Zheng, L., Al Kontar, R. and Raskutti, G. (2020) Stochastic gradient descent in correlated settings: A study on Gaussian processes. Advances in Neural Information Processing Systems, 33, 2722–2733.
  • Currin, C., Mitchell, T., Morris, M. and Ylvisaker, D. (1991) Bayesian prediction of deterministic functions, with applications to the design and analysis of computer experiments. Journal of the American Statistical Association, 86(416), 953–963.
  • Daley, R. (1993) Atmospheric Data Analysis, Number 2. Cambridge University Press, Cambridge, UK.
  • Damianou, A. and Lawrence, N. (2013) Deep Gaussian processes, in Artificial Intelligence and Statistics, Scottsdale, AZ, pp. 207–215.
  • Deisenroth, M. and Mohamed, S. (2012) Expectation propagation in Gaussian process dynamical systems, in Advances in Neural Information Processing Systems, Curran Associates Inc., Red Hook, NY, pp. 2609–2617.
  • Dowsland, K.A. and Thompson, J. (2012) Simulated annealing, in Handbook of Natural Computing, Springer, New York, NY, pp. 1623–1655.
  • Foret, P., Kleiner, A., Mobahi, H. and Neyshabur, B. (2020) Sharpness-aware minimization for efficiently improving generalization. arXiv preprint arXiv:2010.01412.
  • Frigola, R., Lindsten, F., Schön, T.B. and Rasmussen, C.E. (2013) Bayesian inference and learning in Gaussian process state-space models with particle MCMC, in Advances in Neural Information Processing Systems, Curran Associates Inc., Red Hook, NY, pp. 3156–3164.
  • Furrer, R., Genton, M.G. and Nychka, D. (2006) Covariance tapering for interpolation of large spatial datasets. Journal of Computational and Graphical Statistics, 15(3), 502–523.
  • Gardner, J., Pleiss, G., Weinberger, K.Q., Bindel, D. and Wilson, A.G. (2018) GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration, in Advances in Neural Information Processing Systems, Curran Associates Inc., Red Hook, NY, pp. 7576–7586.
  • Gramacy, R.B. (2020) Surrogates: Gaussian Process Modeling, Design, and Optimization for the Applied Sciences, CRC Press, Boca Raton, FL.
  • Gramacy, R.B. and Apley, D.W. (2015) Local Gaussian process approximation for large computer experiments. Journal of Computational and Graphical Statistics, 24(2), 561–578.
  • Gramacy, R.B. and Haaland, B. (2016) Speeding up neighborhood search in local Gaussian process prediction. Technometrics, 58(3), 294–303.
  • Gramacy, R.B. and Lian, H. (2012) Gaussian process single-index models as emulators for computer experiments. Technometrics, 54(1), 30–41.
  • Grünwald, P. (2012) The safe Bayesian, in International Conference on Algorithmic Learning Theory, Springer, New York, NY, pp. 169–183.
  • Guinness, J. (2018) Permutation and grouping methods for sharpening Gaussian process approximations. Technometrics, 60(4), 415–429.
  • Havasi, M., Hernández-Lobato, J.M. and Murillo-Fuentes, J.J. (2018) Inference in deep Gaussian processes using stochastic gradient Hamiltonian Monte Carlo, in Advances in Neural Information Processing Systems, Curran Associates Inc., Red Hook, NY, pp. 7506–7516.
  • Henderson, D., Jacobson, S.H. and Johnson, A.W. (2003) The theory and practice of simulated annealing, in Handbook of Metaheuristics, Springer, New York, NY, pp. 287–319.
  • Hensman, J., Fusi, N. and Lawrence, N.D. (2013) Gaussian processes for big data. arXiv preprint arXiv:1309.6835.
  • Hensman, J., Matthews, A.G., Filippone, M. and Ghahramani, Z. (2015) MCMC for variationally sparse Gaussian processes, in Advances in Neural Information Processing Systems, Curran Associates, Inc., Red Hook, NY, pp. 1648–1656.
  • Hoang, T.N., Hoang, Q.M. and Low, B.K.H. (2015) A unifying framework of anytime sparse Gaussian process regression models with stochastic variational inference for big data, in International Conference on Machine Learning, PMLR, Lille, France, pp. 569–578.
  • Hoffman, M.D., Blei, D.M., Wang, C. and Paisley, J. (2013) Stochastic variational inference. The Journal of Machine Learning Research, 14(1), 1303–1347.
  • Jacot, A., Gabriel, F. and Hongler, C. (2018) Neural tangent kernel: Convergence and generalization in neural networks. arXiv preprint arXiv:1806.07572.
  • Jones, B. and Johnson, R.T. (2009) Design and analysis for the Gaussian process model. Quality and Reliability Engineering International, 25(5), 515–524.
  • Joseph, V.R., Gu, L., Ba, S. and Myers, W.R. (2019) Space-filling designs for robustness experiments. Technometrics, 61(1), 24–37.
  • Journel, A.G. and Huijbregts, C.J. (1978) Mining Geostatistics, Volume 600. Academic Press, London.
  • Kaufman, C.G., Schervish, M.J. and Nychka, D.W. (2008) Covariance tapering for likelihood-based estimation in large spatial data sets. Journal of the American Statistical Association, 103(484), 1545–1555.
  • Kawaguchi, K., Kaelbling, L.P. and Bengio, Y. (2017) Generalization in deep learning. arXiv preprint arXiv:1710.05468.
  • Kennedy, M.C. and O’Hagan, A. (2001) Bayesian calibration of computer models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(3), 425–464.
  • Krishna, A., Joseph, V.R., Ba, S., Brenneman, W.A. and Myers, W.R. (2020) Robust experimental designs for model calibration. arXiv preprint arXiv:2008.00547.
  • Lalchand, V. and Rasmussen, C.E. (2019) Approximate inference for fully Bayesian Gaussian process regression. arXiv preprint arXiv:1912.13440.
  • Lindgren, F., Rue, H. and Lindström, J. (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: The stochastic partial differential equation approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(4), 423–498.
  • Liu, H., Ong, Y.-S., Shen, X. and Cai, J. (2018) When Gaussian process meets big data: A review of scalable GPs. arXiv preprint arXiv:1807.01065.
  • Martinez-Cantin, R. (2014) BayesOpt: A Bayesian optimization library for nonlinear optimization, experimental design and bandits. Journal of Machine Learning Research, 15(1), 3735–3739.
  • Matheron, G. (1973) The intrinsic random functions and their applications. Advances in Applied Probability, 5(3), 439–468.
  • Matthews, A.G.d.G., Rowland, M., Hron, J., Turner, R.E. and Ghahramani, Z. (2018) Gaussian process behaviour in wide deep neural networks. arXiv preprint arXiv:1804.11271.
  • Miller, J.W. and Dunson, D.B. (2019) Robust Bayesian inference via coarsening. Journal of the American Statistical Association, 114(527), 1113–1125.
  • Plumlee, M. (2019) Computer model calibration with confidence and consistency. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 81(3), 519–545.
  • Plumlee, M., Erickson, C., Ankenman, B. and Lawrence, E. (2020) Composite grid designs for adaptive computer experiments with fast inference. Biometrika, 108(3), 749–755.
  • Rana, S., Li, C., Gupta, S., Nguyen, V. and Venkatesh, S. (2017) High dimensional Bayesian optimization with elastic Gaussian process, in Proceedings of the 34th International Conference on Machine Learning, Volume 70, PMLR, Sydney, Australia, pp. 2883–2891.
  • Rényi, A. (1961) On measures of entropy and information, in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics, The Regents of the University of California, Berkeley, CA.
  • Ripley, B.D. (1981) Spatial Statistics, Volume 575. John Wiley & Sons, Hoboken, NJ.
  • Rose, K., Gurewitz, E. and Fox, G. (1990) A deterministic annealing approach to clustering. Pattern Recognition Letters, 11(9), 589–594.
  • Sacks, J., Welch, W.J., Mitchell, T.J. and Wynn, H.P. (1989) Design and analysis of computer experiments. Statistical Science, 4, 409–423.
  • Snelson, E. and Ghahramani, Z. (2006) Sparse Gaussian processes using pseudo-inputs, in Advances in Neural Information Processing Systems, MIT Press, Cambridge, MA, pp. 1257–1264.
  • Snoek, J., Larochelle, H. and Adams, R.P. (2012) Practical Bayesian optimization of machine learning algorithms, in Advances in Neural Information Processing Systems, Curran Associates Inc., Red Hook, NY, pp. 2951–2959.
  • Srinivas, N., Krause, A., Kakade, S.M. and Seeger, M. (2009) Gaussian process optimization in the bandit setting: No regret and experimental design. arXiv preprint arXiv:0912.3995.
  • Stein, M.L. (2014) Limitations on low rank approximations for covariance matrices of spatial data. Spatial Statistics, 8, 1–19.
  • Sung, C.-L., Hung, Y., Rittase, W., Zhu, C. and Wu, C.F.J. (2020) A generalized Gaussian process model for computer experiments with binary time series. Journal of the American Statistical Association, 115(530), 945–956.
  • Takapoui, R. and Javadi, H. (2016) Preconditioning via diagonal scaling. arXiv preprint arXiv:1610.03871.
  • Thompson, P.D. (1956) Optimum smoothing of two-dimensional fields 1. Tellus, 8(3), 384–393.
  • Titsias, M. (2009) Variational learning of inducing variables in sparse Gaussian processes, in Artificial Intelligence and Statistics, Clearwater Beach, FL, pp. 567–574.
  • Titsias, M. and Lawrence, N.D. (2010) Bayesian Gaussian process latent variable model, in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Sardinia, Italy, pp. 844–851.
  • Tran, D., Ranganath, R. and Blei, D.M. (2015) The variational Gaussian process. arXiv preprint arXiv:1511.06499.
  • Tuo, R. and Wang, W. (2020) Kriging prediction with isotropic Matérn correlations: Robustness and experimental designs. Journal of Machine Learning Research, 21(187), 1–38.
  • Wang, K.A., Pleiss, G., Gardner, J.R., Tyree, S., Weinberger, K.Q. and Wilson, A.G. (2019) Exact Gaussian processes on a million data points, in Advances in Neural Information Processing Systems, Curran Associates Inc., Red Hook, NY.
  • Wang, W., Tuo, R. and Wu, C.F.J. (2019) On prediction properties of kriging: Uniform error bounds and robustness. Journal of the American Statistical Association, 115(530), 920–930.
  • Wei, P., Liu, F. and Tang, C. (2018) Reliability and reliability-based importance analysis of structural systems using multiple response Gaussian process model. Reliability Engineering & System Safety, 175, 183–195.
  • Yang, G. (2019) Scaling limits of wide neural networks with weight sharing: Gaussian process behavior, gradient independence, and neural tangent kernel derivation. arXiv preprint arXiv:1902.04760.
  • Yue, X. and Kontar, R.A. (2020a) Why non-myopic Bayesian optimization is promising and how far should we look-ahead? A study via rollout, in International Conference on Artificial Intelligence and Statistics, Sicily, Italy, pp. 2808–2818.
  • Yue, X. and Kontar, R.A. (2020b) Joint models for event prediction from time series and survival data. Technometrics, 63(4), 477–486.
  • Zhang, Q., Chien, P., Liu, Q., Xu, L. and Hong, Y. (2021) Mixed-input Gaussian process emulators for computer experiments with a large number of categorical levels. Journal of Quality Technology, 53(4), 410–420.
  • Zhu, H., Williams, C.K., Rohwer, R. and Morciniec, M. (1997) Gaussian regression and optimal finite dimensional linear models, Aston University, Birmingham, UK.
  • Zomaya, A.Y. and Kazman, R. (2010) Simulated annealing techniques, in Algorithms and Theory of Computation Handbook: General Concepts and Techniques, second edition, Chapman and Hall/CRC, Boca Raton, FL, Chapter 33.
