Abstract
This paper addresses the Bayesian calibration of dynamic models with parametric and structural uncertainties, in particular where the uncertain parameters are unknown/poorly known spatio-temporally varying subsystem models. Independent stationary Gaussian processes with uncertain hyper-parameters describe uncertainties of the model structure and parameters, while Karhunen-Loeve expansion is adopted to spectrally represent these Gaussian processes. The Karhunen-Loeve expansion of a prior Gaussian process is projected on a generalized Polynomial Chaos basis, whereas intrusive Galerkin projection is utilized to calculate the associated coefficients of the simulator output. Bayesian inference is used to update the prior probability distribution of the generalized Polynomial Chaos basis, which along with the chaos expansion coefficients represent the posterior probability distribution. The proposed method is demonstrated for calibration of a simulator of quasi-one-dimensional flow through a divergent nozzle with uncertain nozzle area profile. The posterior distribution of the nozzle area profile and the hyper-parameters obtained using the proposed method are found to match closely with the direct Markov Chain Monte Carlo-based implementation of the Bayesian framework. Efficacy of the proposed method is demonstrated for various choices of prior. Posterior hyper-parameters of the model structural uncertainty are shown to quantify acceptability of the simulator model.
Introduction
With the present ubiquitous use of computer simulators for scientific investigations, uncertainty quantification and calibration of the simulator models is identified as an important area of research.[Citation1–Citation7] Significant developments of the last decade have established the Bayesian framework as a preferred method for uncertainty quantification and calibration of computer simulators.[Citation8–Citation16] This paper explores a Bayesian framework for calibration and credibility assessment of a computer simulator, , with possibly multivariate output. The simulator accepts deterministic control inputs and uncertain parameters . This paper concerns simulators with uncertain spatial functions, thus , where arguments may be different from . The uncertain functions are treated as uncertain subsystem models in this paper. Though the simulator is deterministic in a sense that each run with same arguments always provide same output, the simulator output is uncertain owing to uncertainty in these parameters. Aim of the framework is to infer using Bayesian calibration whenever new experimental observations are available. Figure shows schematic of the proposed framework for a complex system consisting of physically or mechanically interconnected subsystems.
The most appealing facet of the Bayesian framework is its ability to provide the complete posterior statistics. However, barring very simple cases, statistical sampling techniques are required for solution of the Bayesian calibration and uncertainty propagation problems. Markov Chain Monte Carlo (MCMC) methods [Citation17, Citation18] are most widely used sampling techniques for Bayesian calibration. However, exploration of the posterior distribution using MCMC requires collection of a large number of samples for satisfactory approximation (often in the range of ), rendering the Bayesian framework computationally prohibitive for a large-scale system simulator. Thus, for a generalized application of a Bayesian framework to complex large-scale system simulators, development of a computationally efficient Bayesian calibration technique is essential.
Marzouk et al. [Citation19, Citation20] have proposed a spectral projection based method for computationally efficient Bayesian calibration. The method uses a spectral expansion of a prior in generalized Polynomial Chaos (gPC) basis. Note that the gPC method is generalization of the Polynomial Chaos based spectral projection method, which is extensively investigated in the literature as a computationally efficient alternative to statistical methods for uncertainty propagation with comparable accuracy.[Citation21] Polynomial Chaos method is based on the concept of Homogeneous Chaos introduced by Wiener [Citation22, Citation23], where a random variable is spectrally expanded in terms of Hermite polynomials. Cameron and Martin [Citation24] have shown that any non-linear functional can be expanded in terms of a series of Hermite polynomials in sense. Although earlier attempts at using the polynomial Chaos method (especially for turbulent fluid flow modeling) were not very successful,[Citation25–Citation27] the method is found to be useful for the solution of stochastic finite element [Citation28–Citation30] and stochastic fluid flow problems.[Citation31, Citation32] Xiu and Karniadakis [Citation33, Citation34] have generalized the Polynomial Chaos method for spectral projection in terms of the Askey scheme of polynomials.[Citation35] The resultant gPC method has been applied by various researchers for uncertainty propagation through simulators of systems of engineering importance.[Citation36–Citation39]
The method of Marzouk et al. [Citation19, Citation20] uses gPC expansion of as(1) (1) where are gPC expansion coefficients, are orthogonal polynomials from the Askey scheme, while is a set of independent random variables. Galerkin projection is used to calculate the corresponding gPC expansion coefficients of as(2) (2) where denote an inner product on a Hilbert space of the Gaussian random vector . Equation (Equation2(2) (2) ) can be solved using either intrusive or non-intrusive Galerkin projection. The intrusive method requires new discretization, numerical algorithms, and the code development for the reformulated governing equations using Galerkin projection, hence the nomenclature. On the contrary, non-intrusive method uses stochastic collocation approach to solve the reformulated governing equations by solving the deterministic model at appropriate collocation points. See Najm [Citation39] and references therein for detailed discussion on intrusive and non-intrusive Galerkin projection methods. The gPC expansion given by Equation (Equation2(2) (2) ) of the simulator predictions is used to define the likelihood. On availability of the experimental observations, probability distribution of is updated using the Bayesian calibration. Note here that the Bayesian inference is reformulated as sampling from the posterior distribution of .
In this paper, the method proposed by Marzouk et al. [Citation19, Citation20] is extended for priors with uncertain hyper-parameters. Though a family of probability distributions to represent the prior uncertainty can be specified, associated hyper-parameters are rarely known. Realistic quantification of prior uncertainty requires specification of probability distributions for uncertain hyper-parameters. Use of the uncertain hyper-parameters is more ubiquitous in case of calibration of simulators with model structural uncertainty. Hierarchical Bayesian inference is proposed in the literature for calibration in presence of uncertain hyper-parameters.[Citation10, Citation11] However, the methodology proposed by Marzouk et al. [Citation19, Citation20] does not explicitly consider the effect of uncertain hyper-parameters in the formulation. To make the spectral stochastic projection based Bayesian inference more accurate, it is necessary to include uncertain hyper-parameters in the formulation.
This paper proposes an extension of the method of Marzouk et al. [Citation19, Citation20] to take into consideration uncertainty in hyper-parameters of the prior distribution. A methodology is proposed to obtain a Karhunen-Loeve expansion (KL expansion) of a stochastic process in terms of functions of the hyper-parameters. The prior uncertainty in these hyper-parameters is expanded in a gPC basis. Galerkin projection is used to evaluate gPC coefficients of the resultant KL expansion terms of a stochastic process. The prior uncertainty in a subsystem model, represented in terms of gPC basis, is propagated to simulator predictions using the Galerkin projection approach.[Citation30] The Bayesian calibration is reformulated as an MCMC sampling from the posterior distribution of the random vector . Samples from the posterior distribution of uncertain parameters can then be obtained by substituting each posterior sample of into gPC expansion. Posterior distribution of the model structure defines the credibility of the simulator model. The posterior parameters are identified which quantifies acceptability of the simulator.
This research extends existing state of the art by: (a) extending the gPC expansion based method of Marzouk et al. [19,20] for priors with uncertain hyper-parameters; and (b) providing guidelines for acceptability of the simulator model using hyper-parameters of the posterior distribution. Note that a preliminary version of this work was reported in,[Citation40] while this article is significantly expanded by including (a) model structural uncertainty; (b) substantially elaborate theoretical analysis; and (c) additional numerical results to establish computational efficiency and efficacy of the method.
The proposed method is demonstrated using a simulator of a quasi-one-dimensional flow through a nozzle. The particular choice of the application is motivated by the fact that the quasi-one-dimensional nozzle flow is well understood and can be simulated with limited computational resources. Spacial variation of the nozzle area profile, , is assumed uncertain, thus . Gaussian process prior is used to represent uncertainty in the nozzle area profile, while the proposed Bayesian framework is used to infer the nozzle area profile. It may be noted that though the method is demonstrated for Gaussian process prior in the present paper, the proposed method is generic in nature and can admit arbitrary prior. Further, since the proposed method updates probability distribution of , posterior distribution of the nozzle area profile will be non-Gaussian.
The rest of the paper is organized as follows. Section 2 provides a statistical formulation of the problem. In section 3, the proposed method is discussed in detail. In section 4, numerical results for the calibration of a quasi-one-dimensional nozzle flow simulator are presented and finally in section 5, the paper is summarized and concluded.
Statistical formulation
Let a system be investigated by observing system responses, while, be an available simulator of the system response, where is a set of deterministic control inputs and is an uncertain subsystem model, where arguments may be different from . For brevity, the discussion presented in this paper assumes a single uncertain subsystem model; however, the proposed method can be extended to a more generic case without any change. Let denote the ‘true’ subsystem model, while is a model representative of reality. Note that the simulator approximates the system response within limits of available knowledge. Thus conditional on the true subsystem model , deviates from by(3) (3) where is a discrepancy function.
Let represent an experimental observation of the system at a control input setting , while denote the corresponding measurement uncertainty. Relationship between experimental observation and the true system response is given by(4) (4) Let experimental observations be obtained at input conditions and denote a set of experimental observations by where Similarly, define and . Also define , where typically is significantly greater than . Note that represent a realization of a random function at input settings. Information available from the experimental observations is used in the Bayes theorem as(5) (5) where is the prior, is the likelihood and is the posterior probability distribution.
Prior probability in and is specified using independent Gaussian processes. However, a complete definition of the prior requires the specification of uncertain hyper-parameters of the probability distribution. Let and be the hyper-parameters of subsystem model and discrepancy function respectively, where and are set of possible values. Thus, in the presence of the uncertain hyper-parameters, posterior probability distribution (Equation5(5) (5) ) takes the form(6) (6) For better readability, dependence of the uncertain subsystem model and the discrepancy function on uncertain hyper-parameters is explicitly shown henceforth.
In the present paper, Bayesian inference is developed assuming a Gaussian process prior for the discrepancy functions, as it is extensively used in the literature for specification of prior on random functions.[Citation10, Citation41–Citation43] The prior uncertainty in the subsystem model and the discrepancy function is assumed to be independent. Uncertainty in the experimental observations is specified by a zero-mean normally distributed random variable with standard deviation . The experimental uncertainty at different input settings is assumed to be uncorrelated. By marginalization over , the posterior distribution is given by(7) (7) where while , being the identity matrix.
The Metropolis-Hastings algorithm [Citation44, Citation45] is used to sample from the posterior distribution (Equation7(7) (7) ). For each sample, evaluation of , and impose significant computational expenses on the Bayesian framework. In the present paper, a gPC expansion based method is proposed for computationally efficient evaluations of , and .
Generalized polynomial Chaos expansion of a Gaussian process with uncertain hyper-parameters
For brevity, the proposed spectral formulation is described first for a susbsystem model, , which is subsequently extended for a discrepancy function. The formulation is derived for a zero-mean Gaussian process, however, it can easily be extended for non-zero mean processes. The derivation uses KL expansion of the Gaussian process with uncertain hyper-parameters, which is projected on a gPC basis using the intrusive Galerkin projection.
KL expansion
Let be a zero-mean Gaussian process with a covariance function . The covariance function can be approximated as [Citation30](8) (8) where is the number of expansion terms retained in the spectral approximation. and are eigenvalues and eigenfunctions of the covariance kernel, which are given by solution of the Fredholm’s integral equation of the second kind [Citation46](9) (9) Explicit dependence of the eigenvalues and eigenfunctions on the hyper-parameters should be noted. The resultant KL expansion using (Equation8(8) (8) ) is given by [Citation30](10) (10) where are independent zero-mean standard normal random variables. For a given , the eigenvalue problem (Equation9(9) (9) ) can be numerically solved using a Galerkin projection based approach.[Citation47] In this paper, the approach is extended for uncertain hyper-parameters as follows.
Eigenfunctions can be spectrally approximated as(11) (11) where are Legendre polynomials and are respective expansion coefficients.[Citation30, Citation47] Use (Equation11(11) (11) ) in (Equation9(9) (9) ), multiply both sides by and integrate w.r.t. to obtain(12) (12) Using(Equation12(12) (12) ) can be written in a matrix form as(13) (13) which is a generalized eigenvalue problem (GEP) that can be solved using the QZ algorithm.[Citation30]
The matrices in (Equation13(13) (13) ) are functions of . Thus, to solve (Equation13(13) (13) ), consider spectral expansion of eigenvalues and eigenfunctions as(14) (14) where are Legendre polynomials that form complete orthonormal basis on . To ensure orthonormality of the Legendre polynomials, are scaled using minimum and maximum value of . Though the method assumes information about the range of hyper-parameters , specification of complete prior distribution is not necessary. Thus, a conservative estimate of the range suffices at this juncture. The coefficients are given by(15) (15) Using the Gauss-Legendre quadrature, the integrals can be approximated as(16) (16) where are the number of quadrature points used, are the quadrature nodes, while are the respective quadrature weights. The expansion coefficients, and , are calculated by solving (Equation13(13) (13) ) at quadrature nodes and substituting the solution in (Equation16(16) (16) ) to evaluate integrals in (Equation15(15) (15) ).
gPC expansion of a Gaussian process
Hyper-parameters can be expanded in gPC basis as [Citation34](17) (17) where are the Hermite polynomials, are respective expansion coefficients, while is a vector of independent standard normal random variables. Using the Galerkin projection, gPC expansion of is given by(18) (18) Using (Equation14(14) (14) ) and (Equation18(18) (18) ), gPC expansion of is given by(19) (19) where and denote an inner product. Similarly, the gPC expansion of eigenfunctions can be obtained using (Equation14(14) (14) ) and (Equation18(18) (18) ) as(20) (20) Thus, a zero-mean Gaussian process can be expanded in gPC basis as(21) (21) where the expansion coefficients are given by using (Equation10(10) (10) ), (Equation19(19) (19) ) and (Equation20(20) (20) ) as(22) (22) whereis a quadruple product.
Stochastic spectral projection based Bayesian calibration
The prior uncertainty in the subsystem model, , and the discrepancy function, , is given by independent Gaussian processes. Using (Equation21(21) (21) ), the spectral expansion of the prior is given by(23) (23) Galerkin projection approach is used to propagate the prior uncertainty in to the simulator predictions,[Citation37] thus,(24) (24) Using the gPC expansion of and in (Equation3(3) (3) ) to obtain the gPC expansion of as(25) (25) where Use the gPC expansion (Equation25(25) (25) ) in (Equation4(4) (4) ) to obtain(26) (26) Note that are the only uncertain variables in (Equation26(26) (26) ), thus, the Bayesian inference problem is reformulated as sampling from the posterior distribution of .
Let and define the sets of respective gPC coefficients. Using the gPC expansion of the uncertain variables ((Equation23(23) (23) ) – (Equation26(26) (26) )) in (Equation7(7) (7) ), the Bayesian calibration problem can be reformulated in terms of that capture all the randomness in the system model and hyper-parameters:(27) (27) Since are i.i.d. standard normal random variables with independent Gaussian uncertainties in experimental data, the posterior distribution of can be obtained by:(28) (28) where .
Markov Chain Monte Carlo (MCMC) method is used to sample from (Equation28(28) (28) ). Note that MCMC applied to (Equation28(28) (28) ) does not require solution of the simulation model, , thus, the posterior distribution can be explored efficiently. However, solution of (Equation28(28) (28) ) requires numerical evaluation of and , which may impose non-trivial computational cost on the MCMC sampling. In the present paper, numerical evaluation of and is accelerated as follows.
The numerical evaluation of and is accelerated using the gPC expansion of the individual elements of the covariance matrix as(29) (29) The determinant and the individual elements of the inverse can also be expanded in gPC basis as(30) (30) where the gPC expansion coefficients are given by(31) (31) Gaussian quadrature is used to evaluate Polynomial Chaos coefficients. and are calculated by evaluating and at quadrature nodes, while and are evaluated using (Equation31(31) (31) ). The resultant gPC expansion (Equation30(30) (30) ) is used in (Equation28(28) (28) ) for MCMC sampling.
Numerical example: calibration of quasi-one-dimensional nozzle flow simulator
Problem setup
A quasi-one-dimensional supersonic flow through a nozzle is considered to verify the proposed Bayesian calibration method. The quasi-one-dimensional flow is uniform across the cross-section with properties varying in -direction, while, the effect of change in the cross-sectional area is considered. The flow is defined using the compressible Euler equations in conservative form(32) (32) where,Here, is density, is velocity, is cross-sectional area, is static pressure and is the total energy per unit mass. In the governing equations, static pressure is substituted by the total energy per unit mass using(33) (33) while the ideal gas equation is used for the closure.
Variation of nozzle area profile is assumed to be uncertain, which is inferred using the Bayesian framework. Stationary Gaussian process with known mean profile is used as a prior for uncertain nozzle area profile. The procedure of obtaining the stochastic spectral formulation is summarized as follows. Define , , and the corresponding gPC expansionsMultiplying both the sides by and taking the inner product, governing equations of a stochastic quasi-one-dimensional nozzle flow is given by(34) (34) whereand . Note that the governing equations is a set of partial differential equations. These governing equations are numerically solved using the central difference scheme in the spatial dimension and the fourth order Runge-Kutta method in the temporal dimension. A uniform grid with is used in the spatial dimension, while the time step is used for time integration. The inner products involved in the governing equations are evaluated as a-priory using the Gauss-Hermite quadrature.
For demonstration purpose, the proposed method is applied using ‘hypothetical test bed nozzle’ data. The ‘hypothetical nozzle’ can be created using the nozzle simulator with sensors that can measure the nozzle responses of interest with typical accuracies. A hypothetical nozzle is specified through a set of parameters and ‘true’ area profile, all of which are treated as precisely known. Total 5 sensors are assumed to be equally placed along the x-direction of the nozzle. In all the test cases presented in this section, density , velocity , pressure and static temperature are used as responses of interest, which are assumed to be measured by the sensors. Steady state predictions using known nozzle area profile are used as experimental observations. The experimental uncertainty is specified by a zero-mean normally distributed random variable with standard deviation given by 1% of the mean value. For all the test cases, inflow conditions are Mach number , static pressure and density . Figure shows a typical nozzle area profile and spatial variation of normalized responses at steady state. Note that supersonic flow through a divergent nozzle results in increased velocity, while the static pressure, temperature and density are decreased. As can be observed from Figure , this behaviour is captured well by the nozzle simulator.
Prior uncertainty propagation
Prior uncertainty in the nozzle area profile is specified using a Gaussian process with known mean and the covariance function is given by(35) (35) where is the variance and is the correlation length. and are treated as uncertain. Prior uncertainty in is specified using an inverse Gamma distribution, , while Gamma distribution, , is used as a prior for . Here is shape parameter and is rate parameter for Gamma distribution, while is shape parameter and is scale parameter for inverse Gamma distribution.[Citation48] Figure shows prior mean nozzle area profile while Figure shows prior probability distribution. Comparison of prior mean with ‘true’ nozzle area profile is also shown in Figure .
The Bayesian framework is implemented using the intrusive gPC expansion for propagation of the prior uncertainty to the system response. The prior uncertainty is projected on a Hermite Polynomial Chaos basis. To investigate the trade-off between computational efficiency and accuracy, results of Polynomial Chaos method for uncertainty propagation are compared with the Monte Carlo simulation. Total 100,000 samples are used for the Monte Carlo method. All the computations are performed on a desktop computer with intel i5-460 processor. CPU time is estimated using a FORTRAN intrinsic cpu_time routine. Figure shows the comparison of computational time requirement as a function of number of eigenfunctions used, , for different Polynomial Chaos order, . As can be observed from the figure, the computational cost increases polynomially with . In particular, for a system with stochasticity of order , the computational cost increases as , where is the order of non-linearity of the system and is order of the gPC basis.[Citation34]
Figure shows the -error in the mean and the variance. The -error is calculated with respect to the Monte Carlo method. With the increase in number of eigenfunctions used and the polynomial order, error in the mean and the variance reduces. For and the order Polynomial Chaos, method provides prediction of system response with error of the order of in both mean and variance at -times lower computational cost. Note that the computational cost of the proposed Bayesian calibration method is dominated by the computational time requirement for solution of forward propagation problem, while the computation cost of the MCMC sampling is negligible. Thus, the conclusions drawn from the computational cost comparison for forward propagation can be extended to solution of the inverse problem without any change.
Spectral projection-based Bayesian calibration
Baseline case
The computational cost of the proposed method is compared with the direct MCMC sampling for the Bayesian calibration. Prior for the uncertain nozzle area profile is specified using the Gaussian process with the uncertain variance and the covariance length. prior is used for the variance, while prior is used for the correlation length. First four eigenfunctions are used in the spectral expansion, while second-order Hermite polynomials are used as the gPC basis. Model structural uncertainty is quantified by prior for the variance and prior for the correlation length of the covariance function of the discrepancy function. The gPC expansion coefficients of the simulator output, obtained using the intrusive Galerkin projection, are used in the likelihood function to define the posterior distribution, which is explored using the MCMC sampling. Gaussian distribution centred on the present state is used as a proposal distribution for the Markov Chain. Total 100,000 samples are collected after rejecting the initial 10,000 samples. Results of the proposed method are compared with the direct implementation of MCMC for the Bayesian calibration, where the simulator output is used to define the likelihood. Total 10,000 samples are collected using the direct MCMC after the initial burnout period of 1000 samples. The total CPU time required for the implementation of the Bayesian framework using the direct MCMC sampling is 12732.65 seconds. The CPU time requirement for the proposed gPC expansion based Bayesian framework is split into the time required for the intrusive Galerkin projection of the prior uncertainty to the system response and the MCMC sampling from the posterior distribution. For the present test case, intrusive Galerkin projection is implemented in 1194.15 seconds, while the total CPU time for the MCMC sampling is 8.48 seconds, taking 1202.63 seconds for complete implementation of the proposed stochastic spectral projection based Bayesian framework.
Figure shows comparison of the posterior mean nozzle area profile obtained using the direct MCMC and the proposed method. The prior mean nozzle area profile is also shown in the figure. The posterior mean nozzle area profile obtained using both the methods matches closely with the ‘true’ nozzle area profile. The comparison of the posterior distribution for the hyper-parameters of the uncertain nozzle area profile is shown in Figure . The probability distribution of the hyper-parameters is not updated noticeably after the Bayesian calibration, as the experimental observations of the system response are not expected to contain significant information about the covariance structure of the uncertain nozzle area profile. From the figures, it may be concluded that the proposed spectral projection based Bayesian framework provide inference of the uncertain functions with accuracy comparable to the implementation of the direct MCMC sampling at significantly lower computational cost.
Choice of discrepancy function
To investigate the effect of choice of the prior for discrepancy function, Bayesian framework is implemented using different priors for variance. Prior uncertain in the nozzle area profile is specified as discussed earlier. Prior uncertainty in discrepancy function is specified using zero-mean Gaussian process with squared exponential covariance function (Equation35(35) (35) ), where the variance and the correlation length are uncertain hyper-parameters. For all the test cases presented in this section, prior in correlation length is represented using , specifying correlated discrepancy function. For variance, and priors are investigated. To simulate model structure uncertainty, hypothetical test-bed data are obtained using viscous model, whereas, simulator is defined using inviscid model. Total 5 testbed data points are used with 1% standard deviation. Figure shows comparison of posterior mean nozzle area profile with true nozzle area profile and the prior mean. Posterior mean for prior matches closely with the true nozzle area profile, whereas posterior mean nozzle area profile for prior deviates from the true nozzle area profile. Figure demonstrates the significant impact of the prior model structural uncertainty on the inference of the uncertain functions. Confidence on the available model is specified through the prior on the variance. In the case of high confidence on the simulator model, specified through the high value of for prior, significant amount of the information provided by the data is used to update the parameter uncertainty. If the low confidence prior is specified for the model through the lower value of for prior, the calibration process uses significant information provided by the data to improve confidence on the simulator model, whereas less information is used for updating the uncertain parameters. Note that in the case of very high confident prior on the model structure (approaching the scenario of no model structural uncertainty), calibration process attributes any remnant error in the model to the parameters. Thus, the present authors propose to avoid priors that specify either low confidence or very high confidence on the model structure. In the remaining test cases presented in this paper, prior is used for the variance of the discrepancy function.
Effect of simulator model error
To investigate efficacy of the calibration process in presence of error in simulator model, artificial discrepancy is introduced in the simulator model by multiplying the gPC expansion coefficients of the static pressure by 1.5. Modified chaos coefficients are used in the Bayesian framework. The framework is implemented using following two test cases: (1) simulator model without taking into account discrepancy function (which signifies full confidence on the simulator model), and (2) and priors for correlation length and variance of the discrepancy function ( and , respectively). Figure shows comparison of posterior nozzle area profile with the true nozzle area profile. For calibration without considering discrepancy function, Bayesian framework assumes simulator to be the ‘true’ representation of the physics. The Bayesian calibration method attributes all the difference between the test-bed data and the simulator prediction to the uncertain parameters. Thus, resultant posterior mean deviates significantly from the true nozzle area profile. However, when discrepancy function is considered in the Bayesian calibration, no significant update is observed in the nozzle area profile.
Figure shows comparison of the prior and the posterior probability distribution of and . The posterior probability distribution of for the static pressure is shifted towards left, indicating the lower posterior expected value of as compared to the prior. However, no significant change is observed for the probability distribution of . Note that the prior indicated correlated discrepancy in the simulator model. Thus, the posterior probability distribution of the discrepancy function for the static pressure indicates correlated discrepancy in the simulator model with the high expected value of . As the error in the experimental observations enters the Bayesian framework as the uncorrelated uncertainty, the highly correlated discrepancy is expected to result from the error in the simulator model. Thus, the posterior distribution of the discrepancy function signals the need for verification and validation of the simulator, particularly subroutines affecting the static pressure.
Effect of erroneous observations
To investigate effect of the erroneous experimental observations on the Bayesian framework, proposed method is implemented with artificially introduced discrepancy in test-bed data. The methodology is demonstrated by multiplying 1.5 to static pressure data. Figure shows resultant posterior mean nozzle area profile. As observed in case of artificially introduced discrepancy in the simulator, significant deviation is observed between posterior mean and true nozzle area profile when discrepancy function is not considered in the formulation. However, when discrepancy function is considered, no significant change is observed in the nozzle area profile.
Figure shows posterior probability distribution for and . Posterior distribution of for static pressure has moved towards left, indicating the high posterior discrepancy. Posterior probability distribution of for static pressure is shifted towards right, indicating highly uncorrelated discrepancy. Since the uncertainty due to unknown or poorly known physics is expected to result in the correlated discrepancy, uncorrelated discrepancy may be attributed to the experimental observations. Thus, the posterior distribution indicates need for the review of the experimental observations.
Summary
The numerical test cases presented in this paper have demonstrated ability of the proposed Bayesian framework to infer uncertain spatial/temporal functions in the presence of model structural uncertainty. In addition to the inference of the uncertain functions, the proposed Bayesian framework provides useful insight into the credibility of the simulator model. The posterior probability distributions of and , through hyper-parameters and , are identified as indicators of the veracity and validity of the simulator model. Based on the posterior values of the hyper-parameters and for the posterior probability distributions of ( and ) and ( and ), authors provide following guidelines:
If the posterior is greater than the prior , then the simulator model should be accepted with improved confidence.
: The posterior distribution does not have mode resulting in maximum probability for . For such posterior, calibrated simulator model should be rejected.
If mean and mode of posterior distribution of indicate strong correlation for discrepancy function (typically or ), rigorous verification and validation process is advised with a focus on subsystem models that predict system responses for which .
If mean and mode of posterior distribution of indicate very weak correlation for discrepancy function (typically or ), review of experimental observations is advised.
For all the other cases, rigorous verification and validation of simulator model is advised with a note on review of experimental observations.
and : Calibrated simulator can be used, however, high uncertainty in prediction of the simulator response should be expected. As per discussion void point (2), verification and validation of simulator model and a review of experimental observations is advised.
and : Calibrated simulator can be used with high confidence.
Concluding remarks
This paper has proposed a gPC based Bayesian framework for calibration of computer simulators. The proposed framework has extended the established method to the priors with uncertain hyper-parameters. Efficacy of the proposed Bayesian framework is demonstrated for calibration of a quasi-one-dimensional divergent nozzle flow simulator. Ability of the method to infer spatially/temporally varying uncertain functions is shown using the update of the nozzle area profile. The proposed method has provided accurate inference of nozzle area profile at one-tenth of a computational cost as compared to the direct implementation of the Bayesian framework. Hyper-parameters of the posterior distribution of model structure uncertainty are identified that provide information about veracity and validity of the computer simulator. Based on the hyper-parameters, guidelines have been provided for acceptability of the simulator model. Although demonstrated for a specific set of priors, the proposed method is generic in nature and can admit arbitrary priors. However, depending on the gPC basis used, higher order polynomials may be required for the satisfactory spectral approximation of the prior, incurring comparatively higher computational cost on the Bayesian framework.
Acknowledgments
This work was supported in part by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2010-0025484), and in part by the KI Project through KAIST Institute for Design of Complex Systems.
Notes
Note that the terminology ‘true’ parameter is not universally accepted in the open literature, with one point of view considering simulator as a black box and uncertain parameters as ‘tuning-parameters’,[Citation10] while other authors refer to uncertain parameters as ‘true’ parameters which are being inferred.[Citation12]
For notational convenience, is replaced by in this section.
Note that though the proposed method is demonstrated for priors, the method can be applied for arbitrary choices of prior.
References
- Mehta U. Some aspects of uncertainty in computational fluid dynamics results. J. Fluid Engg. 1991;113:538–543.
- Mehta U. Guide to credible computer simulations of fluid flows. J. Prop. Power. 1996;12:940–948.
- Oreskes N, Shrader-Frechett K, Belitz K. Verification, validation and confirmation of numerical models in earth sciences. Science. 1994;263:641–647.
- Mehta UB. Credible computational fluid dynamics simulations. AIAA J. 1998;36:665–667.
- Oberkampf W, DeLand S, Rutherford B, Diegert K, Alvin K. Error and uncertainty in modeling and simulation. Rel. Engg. Syst. Safety. 2002;75:335–357.
- Thunnissen D. Propagating and mitigating uncertainty in the design of complex multidisciplinary systems. California Institute of Technology, California: Ph.D. diss; 2004.
- Cheung SH, Oliver TA, Prudencio EE, Prudhomme S, Moser RD. Bayesian uncertainty analysis with applications to turbulence modeling. Rel. Engg. Syst. Safety. 2011;96:1137–1149.
- Trucano T, Swiler L, Igusa T, Oberkampf W, Pilch M. Calibration, validation, and sensitivity analysis: What’s what. Rel. Engg. Syst. Safety. 2006;91:1331–1357.
- Glimm J, Sharp D. Prediction and the quantification of uncertainty. Physica D. 1999;133:152–170.
- Kennedy M, O’Hagan A. Bayesian calibration of computer models. J. Royal Stat. Soc. Series B (Stat. Method.). 2001;63:425–464.
- Higdon D, Kennedy M, Cavendish J, Cafeo J, Ryne R. Combining field data and computer simulations for calibration and prediction. SIAM J. Sci. Comp. 2005;26:448–466.
- Goldstein M, Rougier J. Probabilistic formulations for transferring inferences from mathematical models to physical systems. SIAM J. Sci. Comp. 2005;26:467–487.
- Bayarri M, Berger J, Paulo R, Sacks J, Cafeo J, Cavendish J, Lin C, Tu J. A framework for validation of computer models. Technometrics. 2007;49:138–153.
- Higdon D, Gattiker J, Williams B, Rightley M. Computer model calibration using high-dimensional output. J. Amer. Stat. Assoc. 2008;103:570–583.
- Kelly D, Smith C. Bayesian inference in probabilistic risk assessment - the current state of the art. Rel. Engg. Sys. Safety. 2009;94:628–643.
- Goldstein M, Rougier J. Reified Bayesian modelling and inference for physical systems. J. Stat. Planning Inf. 2009;139:1221–1239.
- Besag J, Green P, Higdon D, Mengersen K. Bayesian computation and stochastic systems. Stat. Sci. 1995;10:3–41.
- Gamerman D, Lopes H. Markov chain Monte Carlo: stochastic simulation for Bayesian inference. Boca Raton: Chapman and Hall/CRC; 2006.
- Marzouk Y, Najm H. Stochastic spectral methods for efficient Bayesian solution of inverse problems. J. Comp. Phys. 2007;224:560–586.
- Marzouk Y, Najm H. Dimensionality reduction and polynomial chaos acceleration of Bayesian inference in inverse problems. J. Comp. Phys. 2009;228:1862–1902.
- Walters R, Huyse L. Uncertainty analysis for fluid mechanics with applications. NASA/CR-2002-211449, 2002.
- Wiener N. The homogeneous chaos. Amer. J. Math. 1938;60:897–936.
- Wiener N. Nonlinear problems in random theory. New York: John Wiley & Sons; 1958.
- Cameron R, Martin W. The orthogonal development of non-linear functionals in series of Fourier-Hermite functionals. The Annals Math. 1947;48:385–392.
- Meecham W, Jeng D. Use of the Wiener-Hermite expansion for nearly normal turbulence. J. Fluid Mech. 1968;32:225–249.
- Orszag S, Bissonnette L. Dynamical properties of truncated Wiener-Hermite expansions. Phys. Fluids. 1967;10:260–263.
- Chorin A. Gaussian fields and random flow. J. Fluid Mech. 1974;85:325–347.
- Ghanem R, Spanos P. Spectral stochastic finite-element formulation for reliability analysis. J. Engg. Mech. 1991;117:2351–2372.
- Ghanem R, Red-Horse J. Propagation of probabilistic uncertainty in complex physical systems using a stochastic finite element approach. Physica D. 1999;133:137–144.
- Ghanem R, Spanos P. Stochastic finite elements: a spectral approach. New York (NY): Dover Publications; 2003.
- Knio O, Maitre O. Uncertainty propagation in CFD using polynomial chaos decomposition. Fluid Dyn. Res. 2006;38:616–640.
- Maitre O, Knio O, Najm H, Ghanem R. A stochastic projection method for fluid flow. J. Comp. Phys. 2001;173:481–511.
- Xiu D, Karniadakis G. The Weiner-Askey polynomial chaos for stochastic differential equations. SIAM J. Sci. Comp. 2002;24:619–644.
- Xiu D, Karniadakis G. Modeling uncertainty in flow simulations via generalized polynomial chaos. J. Comp. Phys. 2003;187:137–167.
- Koekoek R, Swarttouw R. The Askey-scheme of hypergeometric orthogonal polynomials and its q-analogue, Department of Technical Mathematics and Informatics, Delft University of Technology, Report no. 98–17, 1998.
- Lucor D, Xiu D, Su C, Karniadakis G. Predictability and uncertainty in CFD. Int. J. Num. Meth. Fluids. 2003;43:483–505.
- Mathelin L, Hussaini M, Zang T, Bataille F. Uncertainty propagation for a turbulent, compressible nozzle flow using stochastic methods. AIAA J. 2004;42:1669–1676.
- Narayanan V, Zabaras N. Stochastic inverse heat conduction using spectral approach. Int. J. Num. Methods Engg. 2004;60:1569–1593.
- Najm H. Uncertainty quantification and polynomial chaos techniques in computational fluid dynamics. Ann. Rev. Fluid Mech. 2009;41:35–52.
- Tagade P, Choi HL. A polynomial chaos based Bayesian inference method with uncertain hyperparameters. In ASME International Design Engineering Technical Conference and Computers and Information in Engineering Conference. DC; Washington; 2011.
- Sacks J, Welch W, Mitchell T, Wynn H. Design and analysis of computer experiments. Stat. Sci. 1989;4:409–423.
- Paulo R. Default priors for Gaussian processes. Ann. Stat. 2005;33:556–582.
- OHagan A. Bayesian analysis of computer code outputs: A tutorial. Rel. Engg. Syst. Safety. 2006;91:1290–1300.
- Metropolis N, Rosenbluth A, Rosenbluth M, Teller A, Teller E. Equation of state calculations by fast computing machines. J. Chem. Phys. 1953;21:1087–1092.
- Hastings W. Monte Carlo sampling methods using Markov chains and their applications. Biometrika. 1970;57:97–109.
- Chakrabarti A, Martha S. Approximate solutions of Fredholm integral equations of the second kind. App. Math. Comp. 2009;211:459–466.
- Huang S, Quek S, Phoon K. Convergence study of the truncated KarhunenLoeve expansion for simulation of stochastic processes. Int. J. Num. Methods Engg. 2001;52:1029–1043.
- Choi S, Wette R. Maximum likelihood estimation of the parameters of the Gamma distribution and their bias. Technometrics. 1969;11:683–690.