Abstract
To reduce the effects of data nonlinearity and overfitting in the multivariate calibration model y=Xb, a modified Tikhonov regularization (TR) algorithm is evaluated for selecting key variables from an X augmented with extra columns that contain the original measured variables ($x_{ij}$) as squared terms ($x_{ij}^2$) and terms of other orders. The TR approach selects variables and develops the multivariate calibration model simultaneously. The new generalized pair‐correlation method (GPCM) is also studied for variable selection, followed by partial least squares (PLS) for multivariate calibration. Results from synthetic spectral data are compared for the modified TR approach, GPCM, and PLS without variable selection. The GPCM usually performs slightly better than the TR approach on the tabulated bias and variance measures, in some cases at the expense of parsimony. PLS without variable selection performs the worst. Using synthetic spectral data sets makes it possible to study how the methods work. The results of this study should therefore aid investigators of real spectral data sets exhibiting nonlinear behavior.
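As a rough illustration of the augmentation step described above, the following sketch appends squared columns to X and fits y=Xb with a standard Tikhonov (ridge) penalty. It is not the modified TR algorithm evaluated in the paper, which also performs variable selection; the plain 2-norm penalty is swapped in here only to show the mechanics. The function names `augment_with_powers` and `tikhonov_fit`, the penalty weight `lam`, and the random data are all assumptions made for this example.

```python
import numpy as np

def augment_with_powers(X, max_order=2):
    """Return X with element-wise powers 2..max_order appended as extra columns."""
    return np.hstack([X ** k for k in range(1, max_order + 1)])

def tikhonov_fit(X, y, lam):
    """Standard Tikhonov (ridge) solution of min ||Xb - y||^2 + lam^2 ||b||^2.

    Solved as an ordinary least-squares problem on a stacked system.
    This is the plain 2-norm form, assumed here as a stand-in for the
    modified TR variant mentioned in the abstract.
    """
    n_vars = X.shape[1]
    X_stacked = np.vstack([X, lam * np.eye(n_vars)])
    y_stacked = np.concatenate([y, np.zeros(n_vars)])
    b, *_ = np.linalg.lstsq(X_stacked, y_stacked, rcond=None)
    return b

# Hypothetical data: 30 samples, 4 measured variables, with a
# quadratic dependence on the first variable plus a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(30, 4))
y = 2.0 * X[:, 0] + 0.5 * X[:, 0] ** 2 + 0.01 * rng.standard_normal(30)

X_poly = augment_with_powers(X, max_order=2)  # columns: x_ij, then x_ij^2
b = tikhonov_fit(X_poly, y, lam=0.1)          # one coefficient per column
```

Here `X_poly` has eight columns (the four original variables followed by their squares), and `b` holds one regression coefficient per column. A selection-oriented variant would additionally drive unimportant coefficients to zero, which this sketch does not attempt.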
Notes
This material is based upon work supported by the National Science Foundation under Grant No. CHE 0400034, which the authors gratefully acknowledge. This research was also partially supported by Grant No. 948 from the Faculty Research Committee, Idaho State University. The authors are grateful to Károly Héberger for providing the GPCM Excel program.