Research Article

Multioutput least square SVR-based multivariate EWMA control chart: The performance evaluation and application

Article: 1531456 | Received 03 Jul 2018, Accepted 28 Sep 2018, Published online: 19 Oct 2018

Abstract

Autocorrelation biases the estimators on which standard control charts rely. It is therefore important to develop a control chart that accommodates autocorrelation and to evaluate its performance. The objective of this paper is to evaluate the performance of the multioutput least square support vector regression (MLS-SVR)-based multivariate exponentially weighted moving average (MEWMA) control chart for monitoring multivariate autocorrelated data. For first-order vector autoregressive (VAR) and first-order vector moving average (VMA) data, the proposed control chart tends to yield a stable in-control average run length of about 200. The proposed control chart becomes less sensitive as the MEWMA smoothing parameter increases. In the real application, the proposed method is successfully applied to monitor water turbidity and chlorine residual data in a drinking water manufacturing process.

PUBLIC INTEREST STATEMENT

A control chart is an improvement tool for monitoring how a process changes over time. A good control chart gives a precise conclusion about whether special causes of variation are affecting the actual process. Multivariate control charts for monitoring variables with time dependency have been developed. In this study, one such chart for autocorrelated data, the MLS-SVR-based MEWMA control chart, is evaluated on two types of linear time series data. Furthermore, the proposed control chart reaches a correct decision about when the drinking water manufacturing process needs to be improved.

1. Introduction

Traditional control charts usually assume that observations are statistically independent over time. However, data collected over time often show serial dependency, which affects both the false alarm rate and the shift detection power. Consequently, a control chart developed under the assumption of independent observations performs poorly when it is applied to autocorrelated processes. To overcome this problem, researchers have developed two general approaches. The first fits a time series model to the autocorrelated data and then monitors the residuals using a conventional control chart (Alwan & Roberts, 1988; Montgomery & Mastrangelo, 1991). The second monitors the autocorrelated data with traditional control charts whose control limits are modified (Lu & Reynolds, 1999; Vanbrackle & Reynolds, 1997). The first approach seems to work better at high levels of autocorrelation, while the second works better at low-to-moderate levels (Yashchin, 1993).

Several researchers have investigated the effect of autocorrelation on control chart performance. Serially dependent data may lead to incorrect out-of-control signals and reduce the effectiveness of a control chart (Noorossana & Vaghefi, 2006). If the data follow an autoregressive moving average model of order (1,1), the exponentially weighted moving average (EWMA) control chart detects a shift more quickly than the Shewhart control chart (Wardell, Moskowitz, & Plante, 1994). Jarrett and Pan (2007) evaluated the effect of a mean shift on the performance of a multivariate control chart based on the residuals of a vector autoregressive (VAR) model. For small autocorrelation coefficients, Hotelling's $T^2$ control chart based on the residuals of a VAR model is effective in detecting small shifts of the mean (Pan & Jarrett, 2007).

One of the well-known performance indicators for evaluating and comparing the effectiveness of control charts is the average run length (ARL). However, the distribution of the run length is affected by serial correlation (Bagshaw & Johnson, 1975; Johnson & Bagshaw, 1974). Böhm and Hackl (1996) derived a closed-form approximation of the in-control ARL to evaluate the effect of autocorrelation on the performance of the cumulative sum (CUSUM) control chart. Both Shewhart and CUSUM charts become less sensitive to small shifts in the mean when applied to the residuals of an autocorrelated process (Runger, Willemain, & Prabhu, 1995). Kramer and Schmid (1997) evaluated the performance of the residual-based multivariate exponentially weighted moving average (MEWMA) control chart for a first-order bivariate VAR model. Śliwa and Schmid (2005) found that, for monitoring the cross-covariance, the residual-based MEWMA control chart performs better than the modified MEWMA control chart.

Recently, some researchers have also used machine learning to improve the performance of residual-based multivariate control charts. Khediri, Weihs, and Limam (2010) studied the performance of the residual-based $T^2$, CUSUM of $T^2$ (COT), and $T^2$ of EWMA (EWMAT) control charts. They showed that multivariate control charts based on the residuals of support vector regression (SVR) are more effective than those based on the residuals of an Artificial Neural Network (ANN). The SVR-based MCUSUM control chart also performs better than MCUSUM control charts based on the residuals of ANN and VAR models (Issam & Mohamed, 2008).

SVR is a machine learning method that provides a globally optimal solution. Least square SVR (LS-SVR) is the least square version of standard SVR (Vapnik, 1995, 1998): it replaces the quadratic programming problem with a system of linear equations and replaces the inequality constraints with equality constraints (Suykens, Gestel, Brabanter, Moor, & Vandewalle, 2002; Suykens & Vandewalle, 1999). This linear system is simple to solve and saves computation time. When there is more than one variable of interest, the problem becomes a multioutput case. Xu, An, Qiao, Zhu, and Li (2013) introduced multioutput LS-SVR (MLS-SVR) to incorporate the Hierarchical Bayes intuition (Heskes, 2000), in which each output is permitted to have a different slope function. MLS-SVR is able to map a multivariate input space to a multivariate output space.

Khusna, Mashuri, Suhartono, Prastyo, and Ahsan (2018) proposed the MLS-SVR-based MEWMA control chart and evaluated its ability to detect both additive and innovative outliers in simulated data following a vector autoregressive moving average model of order (1,1). However, the performance of the MLS-SVR-based MEWMA control chart under the ARL criterion has not been investigated. Therefore, the aim of this research is to evaluate the performance of the MLS-SVR-based MEWMA control chart (Khusna et al., 2018) using the ARL criterion. The ARLs are calculated by simulation for several combinations of the smoothing parameter and multivariate linear time series models, including the VAR (1) and VMA (1) models. This research also provides an illustrative example of the MLS-SVR-based MEWMA control chart for monitoring real autocorrelated data.

The rest of this paper is organized as follows. Section 2 describes the MLS-SVR-based MEWMA control chart and how the chart is constructed. The ARL algorithm is presented in Section 3. Section 4 presents the performance of the MLS-SVR-based MEWMA control chart for VAR (1) and VMA (1) data. The application of the proposed method to water quality data is presented in Section 5. Finally, concluding remarks and future research opportunities are given in Section 6.

2. MLS-SVR-based MEWMA control chart

2.1. MLS-SVR algorithm

The standard LS-SVR formulation trains each output of a multioutput problem separately. Xu et al. (2013) proposed the MLS-SVR algorithm to exploit the potential cross-relatedness among outputs, so that the relation between outputs is captured by learning all outputs simultaneously. Let $Y=[y_{ij}]\in\mathbb{R}^{n\times\ell}$ be the observable outputs, where $i=1,2,\ldots,n$ indexes the $n$ samples and $j=1,2,\ldots,\ell$ indexes the $\ell$ outputs. Suppose $(x_1,y_1),(x_2,y_2),\ldots,(x_n,y_n)$ are independent and identically distributed samples, where $x_i\in\mathbb{R}^{d}$, $d$ is the dimension of the input, and $y_i\in\mathbb{R}^{\ell}$.

In order to incorporate the Hierarchical Bayes intuition, the parameter $w_j$ is written as $w_j=w_0+v_j$. The parameter $w_0$ is a mean vector, whereas small values of the vectors $v_j$ indicate that the different outputs are similar to each other. In other words, the mean vector $w_0$ carries the commonality information while the vector $v_j$ carries the specialty information. Let $\varphi:\mathbb{R}^{d}\rightarrow\mathbb{R}^{h}$ be a mapping to a higher dimensional Hilbert space of dimension $h$. All MLS-SVR parameters are assumed to be associated with $\varphi(x)$, so that $w_j\in\mathbb{R}^{h}$ and $v_j\in\mathbb{R}^{h}$ for $j=1,2,\ldots,\ell$, and $w_0\in\mathbb{R}^{h}$.

The objective of the MLS-SVR algorithm is to find the function $f(x)=\varphi(x)^{T}W+b^{T}$ that deviates from the actual observable outputs by at most an acceptable amount and, at the same time, is as flat as possible. In other words, errors are acceptable as long as they do not exceed that deviation. The flatness of $f(x)$ is obtained by seeking a small parameter $W$, which is equivalent to minimizing $w_0^{T}w_0$ as well as $\operatorname{trace}(V^{T}V)$. Therefore, the vector $w_0$, the matrix $V=(v_1,v_2,\ldots,v_\ell)\in\mathbb{R}^{h\times\ell}$, and the vector $b=(b_1,b_2,\ldots,b_\ell)^{T}\in\mathbb{R}^{\ell}$ can be estimated simultaneously by minimizing the following objective function subject to constraints (Xu et al., 2013):

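$$\min\; J(w_0,V,\Xi)=\frac{1}{2}w_0^{T}w_0+\frac{\gamma''}{2\ell}\operatorname{trace}(V^{T}V)+\frac{\gamma}{2}\operatorname{trace}(\Xi^{T}\Xi),\quad \text{s.t.}\quad Y=Z^{T}W+\operatorname{repmat}(b^{T},n,1)+\Xi, \tag{1}$$

where the matrix $Z=(\varphi(x_1),\varphi(x_2),\ldots,\varphi(x_n))\in\mathbb{R}^{h\times n}$, the matrix $\Xi=(\xi_1,\xi_2,\ldots,\xi_\ell)\in\mathbb{R}^{n\times\ell}$ contains the slack variables, the matrix $W=(w_0+v_1,w_0+v_2,\ldots,w_0+v_j,\ldots,w_0+v_\ell)\in\mathbb{R}^{h\times\ell}$ collects the MLS-SVR parameters, and $\gamma,\gamma''\in\mathbb{R}_{+}$ are the regularization parameters.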

The optimization problem assumes feasibility, that is, that a function $f(x)$ exists which approximates all samples $(x_1,y_1),(x_2,y_2),\ldots,(x_n,y_n)$ with precision equal to the acceptable deviation. The matrix $\Xi$ of slack variables is included in optimization problem (1) to cope with constraints that would otherwise be infeasible. In addition, the regularization parameters $\gamma$ and $\gamma''$ control the trade-off between the flatness of $f(x)$ and the amount up to which errors larger than the deviation are tolerated.

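A set of dual variables is introduced to construct a Lagrange function from the objective function and its constraints. The Lagrange function of optimization problem (1) can be written as follows:

$$L(w_0,V,b,\Xi,A)=J(w_0,V,\Xi)-\operatorname{trace}\!\left(A^{T}\left(Z^{T}W+\operatorname{repmat}(b^{T},n,1)+\Xi-Y\right)\right),$$

where $A=(\alpha_1,\alpha_2,\ldots,\alpha_\ell)\in\mathbb{R}^{n\times\ell}$ contains the Lagrange multipliers. According to the Karush–Kuhn–Tucker conditions, the set of linear equations in Equation (2) gives the conditions for optimality.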

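$$\begin{aligned}
\frac{\partial L}{\partial w_0}=0 &\;\Rightarrow\; w_0=\sum_{k=1}^{\ell}Z\alpha_k,\\
\frac{\partial L}{\partial V}=0 &\;\Rightarrow\; V=\frac{\ell}{\gamma''}ZA,\\
\frac{\partial L}{\partial b}=0 &\;\Rightarrow\; A^{T}1_n=0_{\ell},\\
\frac{\partial L}{\partial \Xi}=0 &\;\Rightarrow\; A=\gamma\,\Xi,\\
\frac{\partial L}{\partial A}=0 &\;\Rightarrow\; Z^{T}W+\operatorname{repmat}(b^{T},n,1)+\Xi-Y=0_{n\times\ell}.
\end{aligned} \tag{2}$$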

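Following the idea of LS-SVR, the linear equation system (3) is obtained by eliminating the matrices $W$ and $\Xi$ from Equation (2).

$$\begin{bmatrix}0_{\ell\times\ell} & N^{T}\\ N & M\end{bmatrix}\begin{bmatrix}b\\ \alpha\end{bmatrix}=\begin{bmatrix}0_{\ell}\\ y\end{bmatrix}, \tag{3}$$

where the matrix $N=\operatorname{blockdiag}(1_n,1_n,\ldots,1_n)\in\mathbb{R}^{n\ell\times\ell}$, the positive definite matrix $M=\Omega+\gamma^{-1}I_{n\ell}+(\ell/\gamma'')Q\in\mathbb{R}^{n\ell\times n\ell}$, the matrix $\Omega=\operatorname{repmat}(\tilde{K},\ell,\ell)\in\mathbb{R}^{n\ell\times n\ell}$, the matrix $Q=\operatorname{blockdiag}(\tilde{K},\tilde{K},\ldots,\tilde{K})\in\mathbb{R}^{n\ell\times n\ell}$, the matrix $\tilde{K}=Z^{T}Z\in\mathbb{R}^{n\times n}$ contains the kernel function values, the vector $\alpha=(\alpha_1^{T},\alpha_2^{T},\ldots,\alpha_\ell^{T})^{T}\in\mathbb{R}^{n\ell}$ contains the Lagrange multipliers, and the output variables are stacked as $y=(y_1^{T},y_2^{T},\ldots,y_\ell^{T})^{T}\in\mathbb{R}^{n\ell}$. Hence, linear equation system (3) contains $(n+1)\times\ell$ equations.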

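However, solving linear equation system (3) directly is difficult because its coefficient matrix is not positive definite. For this reason, linear equation system (3) is reformulated as follows:

$$\begin{bmatrix}G & 0_{\ell\times n\ell}\\ 0_{n\ell\times\ell} & M\end{bmatrix}\begin{bmatrix}b\\ M^{-1}Nb+\alpha\end{bmatrix}=\begin{bmatrix}N^{T}M^{-1}y\\ y\end{bmatrix}, \tag{4}$$

where $G=N^{T}M^{-1}N\in\mathbb{R}^{\ell\times\ell}$. It can be shown that $G$ is a positive definite matrix. Thus, the solution of linear equation system (4) can be obtained using the following steps: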

  1. Solve $\vartheta$ and $\nu$ from $M\vartheta=N$ and $M\nu=y$.

  2. Calculate $G=N^{T}\vartheta$.

  3. Find the solution from $b=G^{-1}\vartheta^{T}y$ and $\alpha=\nu-\vartheta b$.
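The three steps above can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function names `rbf_kernel` and `mlssvr_solve`, the RBF parameterisation $\exp(-\lVert x-z\rVert^{2}/(2\sigma^{2}))$, and the argument name `gamma2` (standing for $\gamma''$) are assumptions made for the example. The matrices $\Omega$, $Q$, $M$ and $N$ are built as defined for system (3).

```python
import numpy as np

def rbf_kernel(X1, X2, sigma):
    # Assumed parameterisation: K(x, z) = exp(-||x - z||^2 / (2 sigma^2)).
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mlssvr_solve(K, Y, gamma, gamma2):
    """Solve system (4) for b (length l) and alpha (length n*l, stacked by output).

    K      : (n, n) kernel matrix, K = Z^T Z
    Y      : (n, l) output matrix
    gamma  : regularisation parameter on the slack term
    gamma2 : regularisation parameter on trace(V^T V), i.e. gamma''
    """
    n, l = Y.shape
    Omega = np.tile(K, (l, l))                      # repmat(K, l, l)
    Q = np.kron(np.eye(l), K)                       # blockdiag(K, ..., K)
    M = Omega + np.eye(n * l) / gamma + (l / gamma2) * Q
    N = np.kron(np.eye(l), np.ones((n, 1)))         # blockdiag(1_n, ..., 1_n)
    y = Y.T.reshape(-1)                             # (y_1^T, ..., y_l^T)^T

    theta = np.linalg.solve(M, N)                   # step 1: M theta = N
    nu = np.linalg.solve(M, y)                      #         M nu = y
    G = N.T @ theta                                 # step 2: G = N^T theta
    b = np.linalg.solve(G, theta.T @ y)             # step 3: b = G^{-1} theta^T y
    alpha = nu - theta @ b                          #         alpha = nu - theta b
    return b, alpha, M, N, y

# Small random example to check the solution against system (3).
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3))
Y = rng.normal(size=(12, 2))
K = rbf_kernel(X, X, sigma=1.0)
b, alpha, M, N, y = mlssvr_solve(K, Y, gamma=2.0, gamma2=4.0)
```

A quick sanity check is that the returned pair satisfies both block rows of system (3), that is, $N^{T}\alpha=0$ and $Nb+M\alpha=y$. For large $n\ell$, a Cholesky factorisation of $M$ reused for both right-hand sides would likely be preferable to two separate calls to `np.linalg.solve`.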

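The solution of linear equation system (4) is thus obtained by solving two sets of linear equations that share the same positive definite matrix $M$. Singular value decomposition is an alternative method for solving those linear equations. Suppose that $\tilde{\alpha}=(\tilde{\alpha}_1^{T},\tilde{\alpha}_2^{T},\ldots,\tilde{\alpha}_\ell^{T})^{T}$ and $\tilde{b}$ are the solutions of linear equation system (4); then the decision function of MLS-SVR can be written as follows:

$$\begin{aligned}
\tilde{f}(X)&=\varphi(X)^{T}\tilde{W}+\tilde{b}^{T}=\varphi(X)^{T}\operatorname{repmat}(\tilde{w}_0,1,\ell)+\varphi(X)^{T}\tilde{V}+\tilde{b}^{T}\\
&=\varphi(X)^{T}\operatorname{repmat}\!\left(\sum_{k=1}^{\ell}Z\tilde{\alpha}_k,1,\ell\right)+\frac{\ell}{\gamma''}\varphi(X)^{T}Z\tilde{A}+\tilde{b}^{T}\\
&=\operatorname{repmat}\!\left(\sum_{j=1}^{\ell}\sum_{i=1}^{n}\tilde{\alpha}_{ij}K(X,X_i),1,\ell\right)+\frac{\ell}{\gamma''}\sum_{i=1}^{n}\tilde{\alpha}_{i}K(X,X_i)+\tilde{b}^{T}.
\end{aligned} \tag{5}$$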

Grid search (Hsu, Chang, & Lin, 2016) is used to identify suitable hyper-parameters of MLS-SVR. Over all combinations of the kernel function parameter $\sigma\in\{2^{-15},2^{-13},\ldots,2^{3}\}$ and the regularization parameters $\gamma''\in\{2^{-10},2^{-8},\ldots,2^{10}\}$ and $\gamma\in\{2^{-5},2^{-3},\ldots,2^{15}\}$, the optimal combination of $\sigma$, $\gamma$, and $\gamma''$ is the one that minimizes the mean square error (MSE) averaged over the outputs. An evolutionary algorithm as proposed by Härdle, Prastyo, and Hafner (2014) can also be employed to optimize the SVR parameters; this might be useful for future work.
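The grid search described above can be sketched as below. The exponent ranges are read as $-15$ to $3$ for $\sigma$, $-5$ to $15$ for $\gamma$, and $-10$ to $10$ for $\gamma''$, each in steps of 2, which is an assumption about the grids. The scoring callable `avg_mse` is hypothetical: in practice it would fit MLS-SVR for one hyper-parameter triple and return the MSE averaged over the outputs. The `toy_score` function is only a stand-in so that the example runs.

```python
import itertools
import math

def power_grid(lo, hi, step=2):
    """Return [2**lo, 2**(lo + step), ..., 2**hi] as floats."""
    return [2.0 ** e for e in range(lo, hi + 1, step)]

SIGMA_GRID = power_grid(-15, 3)     # kernel parameter sigma
GAMMA_GRID = power_grid(-5, 15)     # regularisation parameter gamma
GAMMA2_GRID = power_grid(-10, 10)   # regularisation parameter gamma''

def grid_search(avg_mse):
    """Return the (sigma, gamma, gamma2) triple with the smallest score.

    avg_mse(sigma, gamma, gamma2) -> float is a hypothetical callable.
    """
    candidates = itertools.product(SIGMA_GRID, GAMMA_GRID, GAMMA2_GRID)
    return min(candidates, key=lambda c: avg_mse(*c))

def toy_score(sigma, gamma, gamma2):
    # Stand-in for the averaged MSE: minimised at sigma=2^1, gamma=2^3, gamma2=2^-2.
    return ((math.log2(sigma) - 1) ** 2
            + (math.log2(gamma) - 3) ** 2
            + (math.log2(gamma2) + 2) ** 2)

best = grid_search(toy_score)
```

Under this reading the grid has 10 × 11 × 11 = 1210 combinations, so an exhaustive search requires 1210 model fits per data set.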

2.2. Residual-based MEWMA control chart

Let $y_1,y_2,\ldots,y_\ell$ be the observable outputs, where $y_j=(y_{1j},y_{2j},\ldots,y_{nj})^{T}$ with $j=1,2,\ldots,\ell$, and $\ell$ denotes the number of outputs, which follow multivariate autocorrelated processes. Suppose that the outputs have a significant partial autocorrelation function up to lags $p_1,p_2,\ldots,p_\ell$, respectively; then the inputs of MLS-SVR are selected as follows:

$$x_i=\left(y_{1,(i-1)},\ldots,y_{1,(i-p_1)},\,y_{2,(i-1)},\ldots,y_{2,(i-p_2)},\,\ldots,\,y_{\ell,(i-1)},\ldots,y_{\ell,(i-p_\ell)}\right). \tag{6}$$
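As an illustration of Equation (6), the lagged input matrix can be assembled as in the sketch below. The function name `build_lagged_inputs` is hypothetical, and the lags are passed as explicit lists per output, a small generalisation that also covers non-consecutive significant lags.

```python
import numpy as np

def build_lagged_inputs(Y, lags):
    """Build MLS-SVR inputs from lagged outputs.

    Y    : (n, l) array of outputs, one column per output variable
    lags : list of l lists; lags[j] holds the positive lags used for output j
    Returns (X, Y_aligned), where row r of X holds the lagged values that
    precede row r of Y_aligned.
    """
    Y = np.asarray(Y, dtype=float)
    n = len(Y)
    max_lag = max(max(p) for p in lags)
    # Column for output j at lag k: y_{j,(i-k)} for i = max_lag, ..., n-1.
    cols = [Y[max_lag - k : n - k, j] for j, p in enumerate(lags) for k in p]
    return np.column_stack(cols), Y[max_lag:]

# Two outputs, five observations; lags 1 and 2 for output 1, lag 1 for output 2.
Y_demo = np.arange(10).reshape(5, 2)
X_demo, Y_aligned = build_lagged_inputs(Y_demo, [[1, 2], [1]])
```

The first `max_lag` observations are dropped because their lagged values are not available.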

The decision function of the MLS-SVR model with optimal parameters, $\hat{f}(x)$, is calculated using Equation (5), and the residuals are then calculated as $e_{ij}=y_{ij}-\hat{f}(x_{ij})$. Hence, the $n\times\ell$ residual matrix $e$ consists of $e_{ij}$, $i=1,2,\ldots,n$, $j=1,2,\ldots,\ell$.

Lowry, Woodall, Champ, and Rigdon (1992) introduced the MEWMA control chart to monitor small changes in the mean vector of independent observations. If the observations violate the independence assumption, a residual-based MEWMA control chart is employed instead. Suppose that $e_i=(e_{i1},e_{i2},\ldots,e_{ij},\ldots,e_{i\ell})$ is the vector of residuals from the $i$th observation. This vector of residuals is transformed to $Z_i$ as follows:

$$Z_i=\lambda e_i+(1-\lambda)Z_{i-1}, \tag{7}$$

where $\lambda$ is the smoothing parameter with $0<\lambda<1$ and the initial value is $Z_0=0$. The transformed value $Z_i$ in Equation (7) is used to construct the following MLS-SVR-based MEWMA statistic:

$$T_i^{2}=Z_i^{T}\Sigma_{Z_i}^{-1}Z_i, \tag{8}$$

where the covariance matrix of $Z_i$ is calculated by $\Sigma_{Z_i}=\frac{\lambda}{2-\lambda}\left[1-(1-\lambda)^{2i}\right]\Sigma$ and is inverted in Equation (8). Here $\Sigma$ is the covariance matrix of the monitored vectors (the residuals $e_i$ in this chart), and $T_i^{2}$ in Equation (8) denotes the MLS-SVR-based MEWMA statistic. The process is said to be in-control if $T_i^{2}$ is not greater than the upper control limit (UCL), $H$, which is taken from Prabhu and Runger (1997). Under the in-control condition, the residuals of the MLS-SVR model, fitted with the inputs selected using Equation (6) and the optimal parameters obtained as in Equation (5), are assumed to behave as independent observations. Thus, the UCL of the MLS-SVR residual-based MEWMA control chart is equal to the UCL of the MEWMA control chart.
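Equations (7) and (8) translate directly into a short loop. The sketch below is illustrative only; the function name `mewma_t2` is an assumption, and the residual matrix and $\Sigma$ are supplied by the caller.

```python
import numpy as np

def mewma_t2(E, lam, Sigma):
    """Compute the MEWMA statistics T_i^2 for a residual matrix.

    E     : (n, l) residual matrix, row i-1 holds e_i
    lam   : smoothing parameter, 0 < lam < 1
    Sigma : (l, l) covariance matrix of the residual vectors
    """
    E = np.asarray(E, dtype=float)
    n, l = E.shape
    Sigma_inv = np.linalg.inv(Sigma)
    z = np.zeros(l)                                   # Z_0 = 0
    t2 = np.empty(n)
    for i in range(1, n + 1):
        z = lam * E[i - 1] + (1.0 - lam) * z          # Equation (7)
        # Scalar factor of Sigma_{Z_i}: lam/(2-lam) * [1 - (1-lam)^(2i)].
        scale = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i))
        t2[i - 1] = z @ Sigma_inv @ z / scale         # Equation (8)
    return t2

# Two residual vectors, lam = 0.5, identity covariance.
t2_demo = mewma_t2([[1.0, 0.0], [0.0, 0.0]], lam=0.5, Sigma=np.eye(2))
```

At $i=1$ the scale factor reduces to $\lambda^{2}$, so $T_1^{2}=e_1^{T}\Sigma^{-1}e_1$ regardless of $\lambda$. A signal is raised at the first $i$ for which `t2[i - 1] > H`.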

3. ARL algorithm

The ARL is one of the well-known measures for evaluating and comparing control chart performance. The run length (RL) is defined as the number of observations until the first observation is detected outside the control limits. Control chart performance can be assessed by two types of ARL, ARL0 and ARL1. The ARL0, or in-control ARL, is the expected number of observations before an out-of-control observation is detected when the process is actually in-control. The ARL1, or out-of-control ARL, is the expected number of samples before an out-of-control signal is received when the process has actually shifted to an out-of-control state (Montgomery, 2009). For a fixed value of ARL0, a chart is considered more effective than other charts if it has a smaller ARL1.

Several general methods are available for calculating the ARL of a control chart, such as simulation, integral equations, and Markov chain approximation. These methods have been applied by several researchers to the MEWMA control chart. Lowry et al. (1992) and Linderman and Love (2000a, 2000b) calculated the ARL of the MEWMA control chart using simulation. Rigdon (1995) and Bodden and Rigdon (1999) introduced an integral equation approximation for the ARL of the MEWMA chart. Furthermore, a Markov chain approximation for the ARL of the MEWMA chart was proposed by Runger and Prabhu (1996) as well as Molnau et al. (2001).

Calculating the ARL of a control chart for time series data is complex. Śliwa and Schmid (2005) confirmed that, even in the univariate case, there is no explicit formula for the ARL of a control chart for time series data. Kramer and Schmid (1997) and Śliwa and Schmid (2005) used extensive Monte Carlo simulation to determine the ARL of the MEWMA control chart for time series data. In this research, the ARL of the MLS-SVR-based MEWMA control chart is calculated by simulation based on Algorithm 1.

Algorithm 1. The ARL of MLS-SVR-based MEWMA Control Chart

  1. Specify the number of variables $\ell$ and the smoothing parameter $\lambda$.

  2. Define the UCL $H$ for the parameters specified in step 1.

  3. Specify the parameter of the multivariate linear time series data, $\Phi_p$ or $\Theta_q$.

  4. Generate multivariate linear time series data using the parameter $\Phi_p$ or $\Theta_q$ given in step 3 and residuals that follow a multivariate normal distribution with mean vector $\mu_a=0$ and covariance matrix $\Sigma=I$.

  5. Train the MLS-SVR algorithm on the data generated in step 4, where the inputs are selected using Equation (6) and the optimal parameters, along with the optimal hyper-parameters, are obtained as in Equation (5).

  6. For $C$ repetitions, follow these steps:

     a. Generate $n$ samples of multivariate time series data using the parameter $\Phi_p$ or $\Theta_q$ given in step 3 and residuals that have mean vector $\mu_a=0$ and covariance matrix $\Sigma=I$.

     b. Model the data generated in step 6(a) with MLS-SVR using the inputs, parameters, and hyper-parameters obtained in step 5.

     c. Calculate the statistics $T_i^{2}$, $i=1,2,\ldots,n$, using Equation (8).

     d. Calculate the RL, the number of samples until the first statistic $T_i^{2}$ that is greater than $H$.

  7. Calculate ARL0 as the average of the RL over the $C$ repetitions.

  8. For $C''$ repetitions, where $\mu_a$ is replaced by $\mu_a$ + mean vector shift, follow these steps:

     a. Repeat steps 6(a) to 6(d), generating the multivariate time series data with residuals that have the shifted mean vector $\mu_a$ and covariance matrix $\Sigma=I$.

     b. Calculate ARL1 as the average of the RL over the $C''$ repetitions.

  9. Repeat steps 1 to 8 for different $\ell$, $\lambda$, and the associated $H$.

  10. Plot ARL0 and ARL1 against the mean shift.
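The simulation loop of Algorithm 1 can be illustrated with a deliberately simplified sketch. It skips the MLS-SVR fitting in steps 4, 5 and 6(b) and instead draws the residuals directly as independent normal vectors with $\Sigma=I$, which is the behaviour the chart assumes under a well-fitted model. The function names, the repetition counts, and the control limit `H = 9.65` used in the example (an assumed value for two variables and $\lambda=0.2$ that should be checked against Prabhu and Runger, 1997) are all assumptions for illustration.

```python
import numpy as np

def run_length(E, lam, H):
    """Number of observations until T_i^2 first exceeds H (Sigma = I assumed).

    Returns len(E) if no signal occurs, so long runs are censored at n.
    """
    z = np.zeros(E.shape[1])
    for i, e_i in enumerate(E, start=1):
        z = lam * e_i + (1.0 - lam) * z                       # Equation (7)
        scale = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i))
        if z @ z / scale > H:                                 # Equation (8) vs UCL
            return i
    return len(E)

def simulate_arl(l, lam, H, shift=0.0, n=1000, reps=100, seed=0):
    """Average run length over `reps` simulated residual series.

    shift is added to the mean of every residual component:
    shift = 0 gives ARL0, shift > 0 gives ARL1.
    """
    rng = np.random.default_rng(seed)
    rls = [run_length(rng.standard_normal((n, l)) + shift, lam, H)
           for _ in range(reps)]
    return float(np.mean(rls))

arl0 = simulate_arl(l=2, lam=0.2, H=9.65, shift=0.0)
arl1 = simulate_arl(l=2, lam=0.2, H=9.65, shift=1.0)
```

Because runs are censored at `n`, the estimate of ARL0 is biased slightly downward. A full replication of Algorithm 1 would replace the direct residual draws with simulated VAR (1) or VMA (1) data passed through the fitted MLS-SVR model.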

4. Performance evaluation of the proposed control chart

This section investigates the performance of the proposed control chart using the ARL. For this purpose, simulation studies are designed to generate multivariate linear time series data which follow the VAR (1) or VMA (1) model using the following expressions:

$$\text{VAR (1)}:\quad Y_i=\mu+\Phi_1(Y_{i-1}-\mu)+a_i, \tag{9}$$
$$\text{VMA (1)}:\quad Y_i=\mu+a_i-\Theta_1a_{i-1}, \tag{10}$$

where $a_i$ is a white noise residual which follows a multivariate normal distribution with zero mean vector, $\mu_a=0$, and covariance matrix $\Sigma=I$. The performance of the proposed control chart is evaluated for $\ell=2$ and $\ell=3$ quality characteristics. For $\ell=2$ quality characteristics, the mean vector is set to $\mu=(\mu_1,\mu_2)^{T}=(5,10)^{T}$. For $\ell=3$, the mean vector is $\mu=(\mu_1,\mu_2,\mu_3)^{T}=(5,10,7)^{T}$. The parameters of VAR (1) and VMA (1) used in the simulation studies are shown in Table 1.
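Data following Equations (9) and (10) can be generated as in the sketch below. The coefficient matrix `PHI_DEMO` is a hypothetical example chosen only to give a stationary and invertible process; it is not one of the parameter sets in Table 1. The function names and the burn-in length are likewise assumptions.

```python
import numpy as np

def simulate_var1(n, mu, Phi, shift=0.0, burn=200, seed=0):
    """Generate n observations from Y_i = mu + Phi (Y_{i-1} - mu) + a_i."""
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    a = rng.standard_normal((n + burn, mu.size)) + shift   # a_i ~ N(shift, I)
    Y = np.empty((n + burn, mu.size))
    prev = mu.copy()                                       # start at the mean
    for i in range(n + burn):
        prev = mu + Phi @ (prev - mu) + a[i]
        Y[i] = prev
    return Y[burn:]                                        # drop the burn-in

def simulate_vma1(n, mu, Theta, shift=0.0, seed=0):
    """Generate n observations from Y_i = mu + a_i - Theta a_{i-1}."""
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    a = rng.standard_normal((n + 1, mu.size)) + shift
    # Row form of Theta a_{i-1} is a_{i-1}^T Theta^T.
    return mu + a[1:] - a[:-1] @ Theta.T

# Hypothetical coefficient matrix (not taken from Table 1).
PHI_DEMO = np.array([[0.5, 0.1],
                     [0.1, 0.5]])
Y_var = simulate_var1(5000, [5.0, 10.0], PHI_DEMO)
Y_vma = simulate_vma1(5000, [5.0, 10.0], PHI_DEMO)
```

Setting `shift` to a non-zero value moves the residual mean from $\mu_a=0$ to the shifted value, which is how the out-of-control data are described in the simulation design.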

Table 1. The parameter of VAR (1) and VMA (1)

The simulation studies to calculate the ARL of the proposed control chart are carried out based on Algorithm 1 with $n=1000$ samples, $C=1{,}000$ repetitions and $C''=14$ repetitions. These simulation studies use the UCL chosen for an in-control ARL equal to 200. The in-control ARL of the proposed control chart is calculated using generated in-control data which follow the VAR (1) or VMA (1) model, whereas the out-of-control ARL is obtained by adding a mean shift of 0.2 to each residual of the in-control data, i.e. $\mu_a$ is replaced by $\mu_a$ + mean vector shift. The data which follow the VAR (1) model are modelled using the MLS-SVR algorithm with $Y_{1,(i-1)}$ and $Y_{2,(i-1)}$ as inputs, whereas the data which follow the VMA (1) model are modelled using $Y_{1,(i-1)},Y_{1,(i-2)},\ldots,Y_{1,(i-10)}$ and $Y_{2,(i-1)},Y_{2,(i-2)},\ldots,Y_{2,(i-10)}$ as inputs. The inputs are selected based on the significant lags of the partial autocorrelation function. The radial basis function (RBF) kernel is employed in MLS-SVR modelling.

Tables 2 and 3 present the ARL of the proposed control chart for VAR (1) data with $\ell=2$ and $\ell=3$, respectively. For the various smoothing parameters $\lambda$, the MLS-SVR-based MEWMA control chart yields a stable ARL0 of about 200. The ARL1 increases as the smoothing parameter $\lambda$ increases. In other words, the probability that the proposed control chart detects a specified shift in the mean decreases as $\lambda$ increases. Moreover, the ARL1 decreases as the mean shift increases. This indicates that the larger the mean shift, the faster the proposed control chart detects it.

Table 2. The ARL of the proposed control chart for VAR (1) data with $\ell=2$

Table 3. The ARL of the proposed control chart for VAR (1) data with $\ell=3$

The ARLs of the proposed control chart for VMA (1) data with $\ell=2$ and $\ell=3$ are shown in Tables 4 and 5, respectively. The proposed control chart tends to yield an ARL0 of about 200. The ARLs of the proposed control chart for VMA (1) data follow a pattern similar to those for VAR (1) data. If the mean of each variable shifts from 0 to 0.2, the ARL of the proposed control chart with smoothing parameter $\lambda=0.2$ for monitoring VMA (1) data with $\ell=2$ variables decreases from 200 to about 150. For VMA (1) data with $\ell=3$ variables, the ARL decreases from 196 to about 139 for the same shift.

Table 4. The ARL of the proposed control chart for VMA (1) data with $\ell=2$

Table 5. The ARL of the proposed control chart for VMA (1) data with $\ell=3$

The comparison of the ARLs for VAR (1) and VMA (1) data is shown in Figure 1. The smaller the smoothing parameter $\lambda$, the faster the proposed control chart detects an actual mean shift of a process. If the mean of each variable shifts from 0 to 0.2, the ARL of the proposed control chart for monitoring VAR (1) data with $\lambda=0.05$ decreases to about 50. In contrast, the ARL for monitoring VMA (1) data with $\lambda=0.05$ decreases only to about 100 for the same mean shift. This indicates that the proposed control chart is more sensitive for VAR (1) data than for VMA (1) data. Moreover, the ARL curves for VMA (1) data with $\lambda=0.05$ differ markedly from those with $\lambda=0.8$. Hence, the ability of the proposed control chart to detect a mean shift in VMA (1) data depends strongly on the smoothing parameter $\lambda$. On the contrary, the smoothing parameter $\lambda$ does not play a significant role in the ARL of the proposed control chart for VAR (1) data.

Figure 1. The ARLs comparison of the proposed control chart for (a) VAR (1) and (b) VMA (1) data.


5. Application

In this section, the proposed control chart is applied to monitor water quality data in Surabaya, Indonesia. Two important quality characteristics in the drinking water manufacturing process are water turbidity and chlorine residual. Turbidity reflects the concentration of dissolved matter and the presence of particles in a liquid and is measured in Nephelometric Turbidity Units (NTU). Chlorination, the addition of chlorine to contaminated water, is intended primarily to kill microbes; the chlorine residual is measured in ppm. Drinking water is considered safe from bacteria if it has a minimum chlorine residual of 0.2 ppm. However, excessive chlorine addition affects the smell and the taste of the water. Figure 2 displays time series plots of water turbidity and chlorine residual for the Phase I monitoring process. Both variables were measured hourly from 19 August 2016 until 29 August 2016.

Figure 2. Time series plots of (a) hourly water turbidity data and (b) hourly chlorine residual data in Phase I.


The water quality data displayed in Figure 2 are modelled using the MLS-SVR algorithm, where $Y_{1,(i-1)},Y_{1,(i-2)},Y_{1,(i-7)},Y_{1,(i-8)}$ and $Y_{2,(i-1)},Y_{2,(i-3)},Y_{2,(i-9)}$ are selected as input variables. These inputs are chosen by considering the significant lags of the partial autocorrelation function. MLS-SVR modelling using the RBF kernel function yields the optimal combination of hyper parameters γ=25, γ′′=28, and σ=23. The residuals of the MLS-SVR model satisfy the multivariate normal distribution and white noise conditions, with a minimum MSE equal to 0.000605. Furthermore, these residuals are monitored using the MEWMA control chart with significance level α=0.005 and smoothing parameter λ=0.2, as displayed in Figure 3. The MLS-SVR-based MEWMA control chart for the Phase I monitoring process fulfils the in-control condition. Thus, the optimal combination of hyper parameters and the control limit can be applied in the Phase II monitoring process.

Figure 3. MLS-SVR-based MEWMA control chart for water quality data in Phase I.


Time series plots of the water turbidity and chlorine residual data for the Phase II monitoring process are displayed in Figure 4. Both variables were measured over 5 days starting from 30 August 2016. The time series plots for the first four days of Phase II follow the same pattern as the time series plots of the water quality data in Phase I. However, the time series plots for the last day of Phase II, starting from 10.00 AM, show an unusual pattern. The time series plot of water turbidity shows an increasing pattern, while the time series plot of chlorine residual exhibits a steep shift.

Figure 4. Time series plots of (a) hourly water turbidity data and (b) hourly chlorine residual data in Phase II.


The water quality data displayed in Figure 4 are then modelled using the MLS-SVR algorithm with the optimal combination of hyper parameters obtained from the Phase I monitoring process. The MLS-SVR-based MEWMA control chart for the Phase II monitoring process is shown in Figure 5. The residuals resulting from the Phase II monitoring process do not satisfy the white noise condition. Consequently, this leads to many false alarms and a tight control limit. The increasing pattern far beyond the control limit in the last eight observations indicates a serious problem in the water quality data. Hence, monitoring water quality data using the MLS-SVR-based MEWMA control chart provides an early warning that the water manufacturing process needs to be improved. The assignable causes must be identified in order to improve the manufacturing process. On 3 September 2016, a pipeline leak was found in one of the water distribution areas. Moreover, pipeline maintenance and flow meter installation were started at 08.00 PM. As a result, both the water turbidity and chlorine residual plots show an unusual pattern.

Figure 5. MLS-SVR-based MEWMA control chart for water quality data in Phase II.


The MLS-SVR-based MEWMA control chart for water quality data needs to be updated in order to fit the next Phase I monitoring process. This can be done by replacing the original unusual pattern with predicted values, as displayed in Figure 6. The data displayed in Figure 6 are then modelled using the MLS-SVR algorithm as testing data. Furthermore, the white noise residuals resulting from this process are monitored using the MEWMA control chart. Figure 7 shows the MLS-SVR-based MEWMA control chart for the updated water quality data. All of the observations fulfil the in-control condition, so this control chart can serve as the next Phase I monitoring process.

Figure 6. Time series plots of (a) updated hourly water turbidity data and (b) updated hourly chlorine residual data in Phase II.


Figure 7. MLS-SVR-based MEWMA control chart for updated water quality data in phase II.


6. Conclusions and future research

The performance of the MLS-SVR-based MEWMA control chart for monitoring multivariate linear time series data which follow the VAR (1) and VMA (1) models is investigated using the ARL criterion. The smaller the smoothing parameter, the more sensitive the proposed control chart is to an actual mean shift of a process. For a given value of the smoothing parameter, the proposed control chart is more sensitive for VAR (1) data than for VMA (1) data. The ARLs of the proposed control chart for monitoring VMA (1) data depend strongly on the smoothing parameter. In addition, monitoring hourly water quality data using the MLS-SVR-based MEWMA control chart provides an early warning to improve the drinking water manufacturing process when actual assignable causes occur. Developing a theoretical ARL formula for the MLS-SVR-based MEWMA control chart is an open topic for future research. The bootstrap resampling method (Khusna, Mashuri, Ahsan, Suhartono, & Prastyo, 2018) and kernel density estimation (Ahsan, Mashuri, Kuswanto, Prastyo, & Khusna, 2018a, 2018b) are two computational techniques which can be used to develop a more accurate control limit for the proposed chart. Furthermore, the mixed monitoring scheme demonstrated by Ahsan, Mashuri, Kuswanto, Prastyo, and Khusna (2018b) can also be considered for future development of the proposed chart.

Correction

This article has been republished with minor changes. These changes do not impact the academic content of the article.

Acknowledgements

The authors thank the referees and the editor for their valuable comments and suggestions, which have improved the scientific quality of this manuscript.

Additional information

Funding

This work was supported by Research, Technology, and Higher Education Ministry, Republic of Indonesia through PMDSU scheme: [Grant Number 128/SP2H/PTNBH/DRPM/2018].

Notes on contributors

Muhammad Mashuri

The corresponding author, Dr. Muhammad Mashuri, is an associate professor in the Department of Statistics, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia. His research interests include statistical quality control and multivariate analysis, especially in the industrial field. Dr. Suhartono and Dr. Dedy Dwi Prastyo are senior lecturers whose research interests include time series analysis as well as machine learning. Hidayatul Khusna and Muhammad Ahsan are PhD students in the Master Program of Education Leading to Doctoral Degree for Excellent Graduates (PMDSU), an accelerated programme for undergraduates being prepared as candidate lecturers or researchers with doctoral degrees.

References

  • Ahsan, M., Mashuri, M., Kuswanto, H., Prastyo, D. D., & Khusna, H. (2018a). T2 Control Chart based on Successive Difference Covariance Matrix for Intrusion Detection System. In Journal of Physics: Conference Series (Vol. 1028, p. 12220). IOP Publishing.
  • Ahsan, M., Mashuri, M., Kuswanto, H., Prastyo, D. D., & Khusna, H. (2018b). Multivariate Control Chart based on PCA Mix for Variable and Attribute Quality Characteristics. Production & Manufacturing Research, 6(1), 364–384. doi:10.1080/21693277.2018.1517055
  • Alwan, L. C., & Roberts, H. V. (1988). Time-series process modeling for statistical control. Journal of Business & Economic Statistics, 6(1), 87–95.
  • Bagshaw, M., & Johnson, R. A. (1975). The effect of serial correlation on the performance of CUSUM tests II. Technometrics, 17(1), 73–80. doi:10.1080/00401706.1975.10489274
  • Bodden, K. M., & Rigdon, S. E. (1999). A program for approximating the in-control ARL for the MEWMA chart. Journal of Quality Technology, 31(1), 120–123. doi:10.1080/00224065.1999.11979902
  • Böhm, W., & Hackl, P. (1996). The effect of serial correlation on the in-control average run length of cumulative score charts. Journal of Statistical Planning and Inference, 54(1), 15–30. doi:10.1016/0378-3758(95)00153-0
  • Härdle, W. K., Prastyo, D. D., & Hafner, C. M. (2014). Support vector machines with evolutionary model selection for default prediction. In The Oxford handbook of applied nonparametric and semiparametric econometrics and statistics. New York, NY: Oxford University Press.
  • Heskes, T. M. (2000). Empirical Bayes for Learning to Learn. San Francisco: Morgan Kaufmann.
  • Hsu, C. W., Chang, C. C., & Lin, C. J. (2016). A practical guide to support vector classification. Taipei: National Taiwan University.
  • Issam, B. K., & Mohamed, L. (2008). Support vector regression based residual MCUSUM control chart for autocorrelated process. Applied Mathematics and Computation, 201(1–2), 565–574. doi:10.1016/j.amc.2007.12.059
  • Jarrett, J. E., & Pan, X. (2007). The quality control chart for monitoring multivariate autocorrelated processes. Computational Statistics and Data Analysis, 51(8), 3862–3870. doi:10.1016/j.csda.2006.01.020
  • Johnson, R. A., & Bagshaw, M. (1974). The effect of serial correlation on the performance of CUSUM tests. Technometrics, 16(1), 103–112. doi:10.1080/00401706.1974.10489155
  • Khediri, I. B., Weihs, C., & Limam, M. (2010). Support vector regression control charts for multivariate nonlinear autocorrelated processes. Chemometrics and Intelligent Laboratory Systems, 103(1), 76–81. doi:10.1016/j.chemolab.2010.05.021
  • Khusna, H., Mashuri, M., Ahsan, M., Suhartono., & Prastyo, D. D. (2018). Bootstrap based maximum multivariate CUSUM control chart. Quality Technology & Quantitative Management. https://doi.org/10.1080/16843703.2018.1535765
  • Khusna, H., Mashuri, M., Suhartono, Prastyo, D. D., & Ahsan, M. (2018). Multioutput least square SVR based multivariate EWMA control chart. Journal of Physics: Conference Series, 1028, 12221. IOP Publishing.
  • Kramer, H. G. , & Schmid, L. V. (1997). Ewma charts for multivariate time series. Sequential Analysis , 16(2), 131–154. doi:10.1080/07474949708836378
  • Linderman, K. , & Love, T. E. (2000a). Economic and economic statistical designs for MEWMA control charts. Journal of Quality Technology , 32(4), 410–417. doi:10.1080/00224065.2000.11980026
  • Linderman, K. , & Love, T. E. (2000b). Implementing economic and economic statistical designs for MEWMA charts. Journal of Quality Technology , 32(4), 457–463. doi:10.1080/00224065.2000.11980032
  • Lowry, C. A. , Woodall, W. H. , Champ, C. W. , & Rigdon, S. E. (1992). A multivariate exponentially weighted moving average control chart. Technometrics , 34(1), 46–53. doi:10.2307/1269551
  • Lu, C. W. , & Reynolds, M. R. (1999). Control charts for monitoring the mean and variance of autocorrelated processes. Journal of Quality Technology , 31(3), 259–274. doi:10.1080/00224065.1999.11979925
  • Molnau, W. E. , Runger, G. C. , Montgomery, D. C. , Skinner, K. R. , Loredo, E. N. , & Prabhu, S. S. (2001). A program for ARL calculation for multivariate EWMA charts. Journal of Quality Technology , 33(4), 515–521. doi:10.1080/00224065.2001.11980109
  • Montgomery, D. C. (2009). Introduction to statistical quality control . New York, NY: John Wiley & Sons.
  • Montgomery, D. C. , & Mastrangelo, C. M. (1991). Some statistical process-control methods for autocorrelated data. Journal of Quality Technology , 23(3), 179–193. doi:10.1080/00224065.1991.11979321
  • Noorossana, R. , & Vaghefi, S. J. M. (2006). Effect of autocorrelation on performance of the MCUSUM control chart. Quality and Reliability Engineering International , 22(2), 191–197. doi:10.1002/(ISSN)1099-1638
  • Pan, X. , & Jarrett, J. (2007). Using vector autoregressive residuals to monitor multivariate processes in the presence of serial correlation. International Journal of Production Economics , 106(1), 204–216. doi:10.1016/j.ijpe.2006.07.002
  • Prabhu, S. S. , & Runger, G. C. (1997). Designing a multivariate EWMA control chart. Journal of Quality Technology , 29(1), 8. doi:10.1080/00224065.1997.11979720
  • Rigdon, S. E. (1995). An integral equation for the in-control average run length of a multivariate exponentially weighted moving average control chart. Journal of Statistical Computation and Simulation , 52(4), 351–365. doi:10.1080/00949659508811685
  • Runger, G. C. , & Prabhu, S. S. (1996). A markov chain model for the multivariate exponentially weighted moving averages control chart. Journal of the American Statistical Association , 91(436), 1701–1706. doi:10.1080/01621459.1996.10476741
  • Runger, G. C. , Willemain, T. R. , & Prabhu, S. (1995). Average run lengths for cusum control charts applied to residuals. Communications in Statistics - Theory and Methods , 24(1), 273–282. doi:10.1080/03610929508831487
  • Śliwa, P. , & Schmid, W. (2005). Monitoring the cross-covariances of a multivariate time series. Metrika , 61(1), 89–115. doi:10.1007/s001840400326
  • Suykens, J. A. K. , Gestel, T. V. , Brabanter, J. D. , Moor, B. D. , & Vandewalle, J. (2002). Least squares support vector machines . Singapore: World Scientific.
  • Suykens, J. A. K. , & Vandewalle, J. (1999). Least squares support vector machine classifiers. Neural Processing Letters , 9(3), 293–300. doi:10.1023/A:1018628609742
  • Vanbrackle, L. N. , & Reynolds, M. R. (1997). EVVMA and cusum control charts in the presence of correlation. Communications in Statistics - Simulation and Computation , 26(3), 979–1008. doi:10.1080/03610919708813421
  • Vapnik, V. N. (1995). The nature of statistical learning theory . New York, NY: Springer.
  • Vapnik, V. N. (1998). Statistical learning theory . New York, NY: John Wiley & Sons.
  • Wardell, D. G. , Moskowitz, H. , & Plante, R. D. (1994). Run-length distributions of special-cause control charts for correlated processes. Technometrics , 36(1), 3–17. doi:10.1080/00401706.1994.10485393
  • Xu, S. , An, X. , Qiao, X. , Zhu, L. , & Li, L. (2013). Multi-output least-squares support vector regression machines. Pattern Recognition Letters , 34(9), 1078–1084. doi:10.1016/j.patrec.2013.01.015
  • Yashchin, E. (1993). Performance of CUSUM control schemes for serially correlated observations. Technometrics , 35(1), 37–52. doi:10.1080/00401706.1993.10484992