High-Dimensional and Big Data

Predictive Subdata Selection for Computer Models

Pages 613-630 | Received 10 Nov 2021, Accepted 24 Jun 2022, Published online: 25 Jul 2022
 

Abstract

An explosion in the availability of rich data driven by technological advances is hindering statistical analysis because of constraints on computing time and memory, regardless of whether researchers employ simple methods (e.g., linear regression) or complex models (e.g., Gaussian processes). Recent approaches to overcoming these limits include information-based optimal subdata selection and Latin hypercube subagging. In the current study, we develop a novel subdata selection method for large-scale computer models based on expected improvement optimization. Numerical and empirical analyses using real-world data are used to select subdata from which accurate predictions can be derived. During the optimization procedure, the proposed scheme exploits the geometry of the input feature region as well as information related to the output values. The data points associated with the largest improvement in prediction accuracy are combined to construct a subdataset that can be used to formulate predictions within affordable computing time. Supplementary materials for this article, including proofs of theorems and additional numerical results, are available online.
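As background for the term "expected improvement", the following minimal sketch computes the classical expected improvement criterion from efficient global optimization, under the assumption that the predictor at a candidate point is Gaussian with mean `mu` and standard deviation `sigma`. It is not the criterion developed in this article, which targets improvement in prediction accuracy; the function names and the `(mu, sigma)` candidate format are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """Classical EI for minimization: E[max(best - Y, 0)] with Y ~ N(mu, sigma^2).

    Assumption: this is the textbook criterion, not the article's
    prediction-oriented variant.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    return (best - mu) * normal_cdf(z) + sigma * normal_pdf(z)


def pick_candidate(candidates, best):
    """Return the index of the (mu, sigma) pair with the largest EI."""
    scores = [expected_improvement(mu, s, best) for mu, s in candidates]
    return max(range(len(scores)), key=scores.__getitem__)
```

A greedy selection loop would call a function like `pick_candidate` repeatedly, adding the chosen point to the subdata and refitting the predictor; how the article defines and optimizes its own improvement measure is described in the full text, not here.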

Supplementary Materials

The supplementary materials include proofs of theorems and additional numerical results.

Appendix: Section S1 contains the proofs of (3) and of Theorems 1 and 2. Section S2 contains the numerical results for the Piston function (d = 7) and the Wing Weight function (d = 10). Sections S3 and S4 contain the numerical studies for the Bias-Prediction and Larger-d investigations.

R code: R programs which can be used to replicate the numerical results in this article.

Acknowledgments

We thank the editor, associate editor, and two anonymous referees for their constructive comments and suggestions, which have helped us to improve the article.

Additional information

Funding

We gratefully acknowledge funding from Academia Sinica with grant number AS-CDA-111-M05.
