Abstract
The detection of changes in the distribution of process variables is referred to as the change-point problem. Existing methods focus on detecting a single change point (or a few) in a univariate (or low-dimensional) process. We consider the important high-dimensional multivariate case with multiple change points and without an assumed distribution. In this work the problem is transformed into a supervised learning problem with time as the output response and the process variables as inputs. Our focus is to identify the subset of variables that change. This practical scenario is analysed through a supervised learner whose variable importance measure is used to identify the variables that change among hundreds of variables. Simulated cases are discussed in the paper to verify the proposed method. Moreover, the same data sets are analysed with a multivariate exponentially weighted moving average control chart for comparison, and the advantages of the supervised learner are illustrated.
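The reformulation described above can be illustrated with a minimal sketch. It is not the learner used in the paper: as a simple stand-in, each variable is scored by a one-split regression stump that predicts the time index from that variable alone, and the fraction of variance in time explained by the best split serves as a crude importance measure. The function names (`stump_importance`, `simulate`), the simulated mean shift, and all parameter values are assumptions made for illustration only.

```python
import random


def stump_importance(x, t):
    """Fraction of the variance in time t explained by the best single split on x.

    Illustrative stand-in for a variable importance measure; hypothetical helper.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    total = sum(t)
    total_ss = sum(v * v for v in t) - total * total / n
    best = 0.0
    left_sum = 0.0
    for k in range(1, n):
        left_sum += t[order[k - 1]]
        if x[order[k - 1]] == x[order[k]]:
            continue  # cannot place a split between equal values
        right_sum = total - left_sum
        # Reduction in squared error from splitting into left/right groups.
        gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
        best = max(best, gain)
    return best / total_ss


def simulate(n=200, p=20, changed=(3, 7), change_at=100, shift=3.0, seed=0):
    """Assumed toy process: independent N(0, 1) variables, with a mean shift
    in the `changed` variables from time `change_at` onward."""
    rng = random.Random(seed)
    rows = []
    for time in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        if time >= change_at:
            for j in changed:
                row[j] += shift
        rows.append(row)
    return rows


rows = simulate()
t = list(range(len(rows)))  # time is the response
scores = [stump_importance([r[j] for r in rows], t) for j in range(len(rows[0]))]
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(ranking[:2])  # indices of the two highest-scoring variables
```

Variables whose distribution does not change carry little information about time and so receive low scores, while shifted variables separate early from late observations and score highly. A per-variable stump ignores interactions between variables; it is used here only to keep the sketch dependency-free.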
Acknowledgement
This material is based upon work supported by the National Science Foundation under grant No. 0355575.