Multioutput LS-SVR based residual MCUSUM control chart for autocorrelated process

(1)

Multioutput LS-SVR based residual MCUSUM control chart for autocorrelated process ^†

Changha Hwang ¹

1 Department of Applied Statistics, Dankook University

Received 15 January 2016, revised 12 February 2016, accepted 16 February 2016

Abstract

Most classical control charts assume that processes are serially independent, and autocorrelation among variables makes them unreliable. To address this issue, a variety of statistical approaches has been employed to estimate the serial structure of the process. In this paper, we propose a multioutput least squares support vector regression and apply it to construct a residual multivariate cumulative sum control chart for detecting changes in the process mean vector. Numerical studies demonstrate that the proposed multioutput least squares support vector regression based control chart provides more satisfying results in detecting small shifts in the process mean vector.

Keywords: Autocorrelated process, least squares support vector regression, multioutput regression, multivariate cumulative sum control chart, residual control chart, statistical process.

1. Introduction

The multivariate process control is known as one of the rapidly expanding statistical pro- cess controls, which is very appealing, but to construct a control chart the assumption of the independency of serial data points must be satisfied. The problem is that when that assumption is not satisfied the performance of the chart falls off, that is, breaking the in- dependency assumption affects the average run length of the control charts and results in unreliable interpretations (Harris and Ross, 1991; Noorossana and Vaghefi, 2006). For mon- itoring autocorrelated processes, Chan and Li (1994) proposed to improve the multivariate Hotelling’s T ² charts for autocorrelated processes. Hunter (1998) proposed the process ad- justment scheme in which the structure of time series is estimated and the residuals term is used to control the process. Orlando et al. (2002) proposed a residual univariate cumulative sum charts for autocorrelated data. Callao and Rius (2003) showed that residual control charts provide better understanding of the system pattern over time and efficient shift de- tection abilities. Snoussi et al. (2005) showed that residuals control charts have the ability to provide better shift detection than observation control charts.

In many previous studies the parameters and residuals in the time series are estimated by classical statistical methods, but underlying assumptions are not usually satisfied in real

† The present research was conducted by the research fund of Dankook university in 2015.

1

Professor, Department of Applied Statistics, Dankook University, Yongin 16890, Korea.

E-mail: [email protected]

(2)

data problems. Recently alternative methods (Dooley and Guo, 1992; Jamal et al., 2007;

Khediri and Liman, 2008; Khediri et al., 2010) based on the machine learning methods such as artificial neural networks (Lawrence, 1994) and support vector machines (Vapnik, 1995) have been proposed, which are less restrictive to assumptions and more adaptable to real data problems than classical statistical methods.

SVM, first developed by Vapnik (1995) and his group at AT&T Bell Laboratories, has been successfully applied to numbers of real world problems related to classification and regression problems. Least squares SVM (LS-SVM) is a least squares version of SVM and was initially introduced by Suykens and Vanderwalle (1999). LS-SVM has been proved to be a very appealing and promising method. There are some strong points of LS-SVM. One is that LS-SVM uses the linear equation which is simple to solve and good for computational time saving. Another is that LS-SVM makes the model selection easier by using the generalized cross validation function. Introductions and overviews of recent developments of SVM and LS-SVM can be found in Smola and Scholkopf (1998), Suykens et al. (2001), Hwang (2014, 2015), Seok (2015), Shim and Hwang (2014), and Shim and Seok (2014).

In this paper we propose the multioutput least squares support vector regression (MLS- SVR) which has a different approach from Xu et al. (2013) since we consider the covariance of error terms, and use it for the construction of the residuals control chart. Application of the proposed method is performed by controlling the residuals for a multivariate autore- gressive process of order one. The rest of this paper is organized as follows. In Section 2, we briefly review the autoregressive residual multivariate cumulative sum (MCUSUM) control charts. In Section 3 we propose MLS-SVR and apply it constructing a residual MCUSUM control chart. In Section 4 we perform the experimental study of residual control charts with simulated data sets. In Section 5 we give the conclusions.

2. Autoregressive residual MCUSUM control charts

For the multivariate process which is characterized by autocorrelation in each variable, evaluation and monitoring of the system can be constructed by using autoregressive mod- els, which leads to control residuals as a serially independent multivariate series. Using an autoregressive model to approximate a linear time series system is appropriate due to the physical principles of the process dynamics. For a multivariate autoregressive process of m variables at time t, denote y _t = (y t1 , · · · , y tm ) ⁰ . If these variables contain autocorrelation of order p, we have

y _t = f (y _t−1 , · · · , y _t−p ) + e _t , where e _t is an error vector with zero means.

If we suppose f is linear and µ is a mean vector such that µ = E(y _t ) for t = 1, 2, · · · , y _t can be reexpressed as

y _t = µ + ρ 1 (y _t−1 − µ) + · · · + ρ p (y _t−p − µ) + e t , Then the predicted y _t and residual vector r t are obtained as follows:

y b _t = µ + ρ b 1 (y _t−1 − µ) + · · · + b ρ b p (y _t−p − µ) and r b t = y t − b y _t ,

(3)

Since µ = E(y _t ) for t = 1, 2, · · · , the mean residual vector m _r = 0 when the process is in control. MCUSUM scheme cumulates deviations more than k standardized units from the target mean value. Thus, k serves as the reference value of the scheme. Applying Healy’s procedure (1987), the following MCUSUM statistic S t allows to detect a specific shift in the process mean vector:

S t = max(S t−1 + a ⁰ r t − k, 0), (2.1) where a ⁰ _t = √ ^m

⁰^r

^Σ

⁻¹^r

m

⁰_r

Σ

⁻¹r

m

r

and Σ _r is the covariance matrix of r _t .

The control scheme signals an out-of-control situation when the value of S t is greater than a predetermined value H.

To evaluate the control chart parameters, the most popular statistics is the average run length which represents the average number of time points that must be less than a time point which indicates an out-of-control condition. Because estimation of large in-control average run lengths requires a long computation time, the value of in-control ARL is mostly set to be small which imply usually a nonacceptable high rate of false alarms (Khediri et al., 2010). Using the table of Lucas and Crosier (1982) the values of and can be obtained according to predetermined value of average run length.

3. Multioutput LS–SVR based residual control chart

3.1. Multioutput LS-SVR

Denote the training data set by (x, y) = {x i, y ij } ^n,m _i,j=1 , with each input x i ∈ R ^d and the output y ij ∈ R. We consider the case of nonlinear regression of the form,

f j (x i ) = ω ⁰ _j φ(x i ) + b j , (3.1) where the term b j is a bias term. Here the feature mapping function φ(·) : R ^d → R ^d

^f

maps the input space to the higher dimensional feature space where the dimension d f is defined in an implicit way. Using a weighted quadratic loss function we define the optimization problem for estimation of (ω j , b j ) as follows:

min 1 2

m

X

j=1

||ω j || ² + C 2

n

X

i=1 m

X

j,k=1

e _ij V _jk ⁻¹ e _ik

over {ω j , b j , e ij } subject to equality constraints,

e ij = y ij − ω ⁰ _j φ(x i ) − b j , i = 1, · · · , n, j = 1, · · · , m.

Here C > 0 is a penalty parameter, V is a covariance matrix of (e i1 , · · · , e im ) ⁰ and V _jk ⁻¹ (j, k = 1, · · · , m) is the (jk)th element of the inverse of V , V ⁻¹ .

The Lagrangian function can be constructed from the optimization problem above,

L = 1 2

m

X

j=1

||ω j || ² + C 2

n

X

i=1 m

X

j,k=1

e ij V _jk ⁻¹ e ik −

n

X

i=1 n

X

j=1

α ij (e ij − y ij + ω ⁰ _j φ(x i ) + b j ),

(4)

where α _ij is Lagrange multiplier. The optimality conditions are given by

∂L

∂ω j

= 0 → ω j =

n

X

i=1

α ij φ(x i ), j = 1, · · · , m,

∂L

∂b j

= 0 →

n

X

i=1

α ij = 0, j = 1, · · · , m,

∂L

∂e ij

= 0 → α _ij = C

m

X

k=1

V _jk ⁻¹ e _ik , i = 1, · · · , n, j = 1, · · · , m,

From the optimality the optimal Lagrange multipliers and the estimate of b j ’s can be obtained as the solutions to the linear equations as follows:

y ij − K(x i , x)α .j − b j − 1 C V j.





 α i1

α i2

.. . α im







= 0, i = 1, · · · , n, j = 1, · · · , m (3.2)

n

X

i=1

α ij = 0, j = 1, · · · , m, (3.3)

with α .j = (α 1j , · · · , α nj ) ⁰ , V j. = (V j1 , · · · , V jm ) and K(x i , x) = (K(x i , x 1 ), · · · , K(x i , x n )), where K(x i , x k ) = φ(x i ) ⁰ φ(x k ), k = 1, · · · , n, which are obtained from the application of Mercer’s conditions (1909). Several choices of the kernel K(·, ·) are possible.

Thus the optimal regression function for the given x _t is obtained as

f b j (x t ) =

n

X

i=1

K(x t , x i ) α b ij + b b j , j = 1, · · · , m. (3.4)

If y i1 , · · · , y im are assumed to be independent, then the linear equations (3.2) can be expressed as follows:

y ij − K(x i , x)α .j − b j − 1

C α ij = 0, i = 1, · · · , n, j = 1, · · · , m,

which leads that ( α b _.j , b b _j ) can be obtained by LS-SVR (Suykens and Vanderwalle, 1999) using (x, y _.j ) for j = 1, · · · , m.

In fact V _jk is not known, ( α b _ij , b b _j ) can not be obtained as the solution to the linear equations (3.2) and (3.3). We use the iterative method with the initial V = I _m as follows:

(i) Find ( α b ij , b b j ) from the linear equations (3.2) and (3.3) using the estimate of V jk obtained in the previous step.

(ii) Find the estimate of V jk by

V _jk = 1 n − 1

n

X

i=1

(y _ij − b f _j (x _i ))(y _ik − b f _k (x _i )), j, k = 1, · · · , m.

(5)

(iii) Iterate steps until convergence.

The functional structure of MLS-SVR is characterized by the penalty parameter and the kernel parameter. To determine the optimal values of the hyperparameters, we use a cross- validation (CV) technique, where the CV function is defined as follows:

CV (Λ) = 1 nm

n

X

i=1 m

X

j=1

(y ij − b f _j ⁽⁻ⁱ⁾ (x i )) ² ,

where Λ is a set of the penalty parameter and the kernel parameter and b f _j ⁽⁻ⁱ⁾ (x i ) is the estimated regression function corresponding to y ij , which is obtained without the i th ob- servation (x i , y ij ). Using the leave-out-one lemma (Wahba, 1990) and the first order Taylor expansion generalized cross validation function can be obtained as follows:

GCV (Λ) = nm

n

P

i=1 m

P

j=1

(y ij − b f j (x i )) ²

(nm − trace(H)) ² , (3.5)

where H is a hat matrix such that ( b f 1 (x 1 ), b f 1 (x 2 ), · · · , b f m (x n )) ⁰ = H(y 11 , y 12 , · · · , y mn )).

3.2. Multioutput LS-SVR based control chart

To perform time series estimation for a multivariate process by MLS-SVR, we define each input vector x t as the previous reponses of the series (y _t−1 , · · · , y _t−p ), where p is the lagged time. We have an autoregressive process with m variables and order p, which can be represented by the function as follows:

y tj = f j (x t ) + e tj = f j (y _t−1 , · · · , y _t−p ) + e tj , j = 1, · · · , m.

Then b y tj = b f tj (x t ) is obtained by (3.4) and the residual term (r t1 , · · · , r tm ) ⁰ = (y t1 − b y _t1 , · · · , y _tm − y b _tm ) ⁰ would follow time-independent normal distribution with zero means, which is used for the construction of control chart. If a shift is present, the process cannot be represented by f _j ’s and the residuals would be affected.

To estimate (α _.j , b _j )’s, it is necessary to train MLS-SVR algorithm on the in-control data set without a shift after model selection, and use the obtained multioutput LS-SVR structure to predict future reponses of the out-of-control data set to obtain the residual vector. For in- control situation, the mean vector of the residuals must be zero. Whereas, for out-of-control situation, the shift in the mean vector is reflected in the mean of the residual vector. To detect the specific shift in the out-of-control data set, we apply MCUSUM chart to control residuals by performing Healy’s procedure of (2.1). Run length is determined as t ^∗ where S t obtained from the out-of-control data is less than and equal to the predetermined H for t = 1, · · · , t ^∗ and S t

^∗

+1 > H.

4. Numerical studies

In this section we deal with implementation of MCUSUM chart based on MLS-SVR and

evaluation of its performance by comparing with the autoregressive process with order 1

(6)

(AR(1)) and LS-SVR based control charts. The nonlinear benchmark multivariate auto- correlated process used by Jamal et al. (2007), Ben Kheirdi and Liman (2008), Issam and Mohamed (2008) is modeled as follows:

y t1

y t2

= µ 1

µ 2

+ ρ 11 ρ 12

ρ 21 ρ 22

y t−1,1 − µ 1

y t−1,2 − µ 2

+ e t1

e t2

,

where ^µ _µ

¹

2

= 260 470

, ρ

11

ρ

12

ρ

21

ρ

22

= 0.0146 0.0177 0.6493 0.0958

and e

t1

e

t2

∼ N 0 0

, 99.91 63.99 63.99 69.52

. To generate the out-of-control data set, we add the shift vectors to the mean vector. We generate the in-control data set of size 300 and obtain the run length of the out-of-control data set with different shift vectors. We repeated this procedure 1000 times to obtain the average run length (ARL) of the out-of-control data set.

For a target of ARL equal to 205, the values of k and H in Healy’s procedure (1987) in (2.1) for the three charts, are given 0.75 and 2.5, respectively. Gaussian kernel is utilized in LS-SVR and MLS-SVR, which is given as follows:

K(x ₁ , x ₂ ) = exp − 1

σ ² ||x 1 − x 2 || ²

! .

Hyperparameters of LS-SVR and MLS-SVR are chosen by generalized cross validation function from the candidate sets of C s and σ ² s such as (100,200,300,400) and (0.25, 0.5, 0.75,1,1.5, 2).

To find MCUSUM statistic S t , the residual vector r t = (r t1 , r t2 ) ⁰ is needed. In AR(1) and LS-SVR, r tj is obtained from {y ij } ³⁰⁰ _i=1 in the in-control data set and y tj in the out-of-control data set for j = 1, 2. In MLS-SVR, (r t1 , r t2 ) are simultaneously obtained from {y i1 , y i2 } ³⁰⁰ _i=1 in the in-control data set and (y t1 , y t2 ) in the out-of-control data set.

Applying the above procedure, by adding a given shift vector to the in-control mean vector, we obtain the run length’s by three control charts for the out-of-control data set. In order to compare efficiently the performance of three charts, we use several shift vectors.

The in-control ARL of AR(1), LS-SVR and MLS-SVR based control charts are obtained as 203.073, 206.864 and 202.472, respectively, from 1000 in-control data sets of size 1000.

Results including the out-of-control ARL values for AR(1), LS-SVR and MLS-SVR based control charts are presented in Table 4.1. The boldfaced figure in each column signifies the smallest ARL.

Table 4.1 Out-of-control ARL’s by three control charts (standard error in parenthesis)

shift distance AR(1) LS-SVR MLS-SVR

1 [0.5;0.3] 0.050 180.172 (5.485) 61.860 (2.227) 55.198 (2.276) 2 [1;0.7] 0.101 115.164 (3.801) 57.158 (2.019) 40.721 (1.646) 3 [1.5;1] 0.150 104.652 (3.492) 46.251 (1.648) 38.592 (1.605) 4 [1.6;0] 0.250 87.082 (2.836) 13.345 (0.414) 14.842 (0.593) 5 [-1.35;1] 0.374 42.095 (1.383) 10.371 (0.321) 10.509 (0.396) 6 [2;-2.8] 0.790 16.637 (0.493) 6.819 (0.200) 5.925 (0.205) 7 [-8;0] 1.249 3.834 (0.081) 3.250 (0.095) 2.614 (0.080) 8 [6.6;-7.5] 2.293 1.561 (0.036) 1.719 (0.054) 1.214 (0.039) 9 [-9;10] 3.085 0.679 (0.022) 0.927 (0.037) 0.666 (0.026) 10 [14;-14] 4.522 0.091 (0.009) 0.208 (0.015) 0.199 (0.017)

From Table 4.1 and Figure 4.1, we can see that MLS-SVR based control chart provides

the best performance in detecting small changes in the process mean vector. As an example,

when the mean vector shifts by [1.5;1], the out-of-control ARL’s of AR(1), LS-SVR and

(7)

MLS-SVR based control charts are 104.652, 46.251 and 38.592, respectively. Whereas the difference between the out-of-control ARL’s of the control charts is insignificant for larger shifts. This result can be explained by the fact that the solution of the multioutput LS-SVR assumes that responses at each time are correlated.

Figure 4.1 Out-of-control ARL’s by three control charts. Dotted lines: AR(1) based, dashed lines: LS-SVR based, solid lines: MLS-SVR based.

5. Conclusion

In this paper we proposed a multioutput LS-SVR based residuals MCUSUM control chart to detect changes in the process mean vector when there exist autocorrelations between variables, which is a main advantage over LS-SVR based control chart. And the main ad- vantage of this chart over the autoregressive process based control chart is that it can handle nonlinear relations between the variables.

Results of the three control charts in the numerical studies show that the multioutput LS-SVR based control chart is more effective in detecting small shifts in the mean vector.

Whereas for larger shifts, the difference in the performance of three charts is insignificant.

This implies that the proposed chart is a promising method since the MCUSUM chart is usually constructed to detect small shifts in the process means..

References

Callao, M. P. and Rius, A. (2003). Time series: a complementary technique to control charts for monitoring analytical systems. Chemometrics and Intelligent Laboratory Systems, 66, 79-87.

Chan, L. K. and Li, G. Y. (1994). A multivariate control chart for detecting linear trends. Communications in Statistics: Simulation and Computation, 23, 997-1012.

Dooley, J. and Guo, K. (1992). Identification of change structure in statistical process control. International Journal of Production Research, 30, 1655-1669.

Harris, T. J. and Ross, W. H. (1991). Statistical process control procedures for correlated observations.

Canadian Journal of Chemical Engineering, 69, 48-57.

Healy, J. D. (1987). A note on multivariate CUSUM procedures. Technometrics, 29, 409-412.

(8)

Hunter, J. S. (1998). The Box-Jenkins manual adjustment chart. Quality Progress, 31, 129-137.

Hwang, H. (2014). Support vector quantile regression for autoregressive data. Journal of the Korean Data

& Information Science Society, 25, 1539-1547.

Hwang, H. (2015). Partially linear support vector orthogonal quantile regression with measurement errors.

Journal of the Korean Data & Information Science Society, 26, 209-216.

Issam, B. K. and Mohamed, L. (2008). Support vector regression based residual MCUSUM control chart for autocorrelated process. Applied Mathematics and Computation, 201, 565-574.

Jamal, A., Seyed, T. Akhavan, N. and Babak, A. (2007). Artificial neural networks in applying MCUSUM residuals charts for AR(1) processes. Applied Mathematics and Computation, 189, 1889-1901.

Khediri, I. B. and Limam, M. (2008). Support vector regression based residual MCUSUM control chart for autocorrelated process. Applied Mathematics and Computation, 201, 565-574.

Khediri, I. B., Weihs, C. and Limam, M. (2010). Support vector regression control charts for multivariate nonlinear autocorrelated processes. Chemometrics and Intelligent Laboratory Systems, 103, 76-81.

Lawrence, J. (1994). Introduction to Neural Networks, California Scientific Software Press, California.

Lucas, J. M. and Crosier, R. B. (1982). Fast initial response for cumulative sum quality control schemes.

Technometrics, 24, 199-205.

Mercer, J. (1909). Functions of Positive and Negative Type and Their Connection with Theory of Integral Equations. Philosophical transactions of the royal society of London. Series A, containing papers of a mathematical or physical character , 415-446.

Noorossana, R. and Vaghefi, S. J. M. (2006). Effect of autocorrelation on performance of the MCUSUM control chart. Quality and Reliability Engineering International , 22, 191-197.

Orlando, O., Atienza, L. C. and Tang, B. W. (2002). A CUSUM scheme for autocorrelated observations.

Journal of Quality Technology, 34, 187-199.

Seok, K. (2015). Semisupervised support vector quantile regression. Journal of the Korean Data & Infor- mation Science Society, 26, 517-524.

Shim, J. and Hwang, C. (2014).. Support vector quantile regression ensemble with bagging. Journal of the Korean Data & Information Science Society, 25, 677-684.

Shim, J. and Seok, K. (2014). A transductive least squares support vector machine with the difference convex algorithm. Journal of the Korean Data & Information Science Society, 25, 455-464.

Smola, A. and Scholkopf, B. (1998). On a kernel-based method for pattern recognition, regression, approx- imation and operator inversion. Algorithmica, 22, 211-231.

Snoussi, A., Ghourabi, M. E. and Limam, M. (2005). On SPC for short run autocorrelated data. Commu- nication in Statistics, Simulation and Computation, 34, 219-234.

Suykens, J. A. K. and Vanderwalle, J. (1999). Least square support vector machine classifier. Neural Pro- cessing Letters, 9, 293-300.

Suykens, J. A. K., Vanderwalle, J. and De Moor, B. (2001). Optimal control by least squares support vector machines. Neural Networks, 14, 23-35.

Wahba, G. (1990). Spline models for observational data, SIAM, Philadelphia.

Xu, S., An, X., Qiao, X., Zhu, L. and Li, L. (2013). Multi-output least-squares support vector regression

machines, Pattern Recognition Letters, 34, 1078-1084.

Multioutput LS-SVR based residual MCUSUM control chart for autocorrelated process

Multioutput LS-SVR based residual MCUSUM control chart for autocorrelated process †

Changha Hwang 1

1 Department of Applied Statistics, Dankook University

Received 15 January 2016, revised 12 February 2016, accepted 16 February 2016

Abstract

Keywords: Autocorrelated process, least squares support vector regression, multioutput regression, multivariate cumulative sum control chart, residual control chart, statistical process.

1. Introduction

In many previous studies the parameters and residuals in the time series are estimated by classical statistical methods, but underlying assumptions are not usually satisfied in real

† The present research was conducted by the research fund of Dankook university in 2015.

Professor, Department of Applied Statistics, Dankook University, Yongin 16890, Korea.

E-mail: [email protected]

data problems. Recently alternative methods (Dooley and Guo, 1992; Jamal et al., 2007;

2. Autoregressive residual MCUSUM control charts

y t = f (y t−1 , · · · , y t−p ) + e t , where e t is an error vector with zero means.

If we suppose f is linear and µ is a mean vector such that µ = E(y t ) for t = 1, 2, · · · , y t can be reexpressed as

y t = µ + ρ 1 (y t−1 − µ) + · · · + ρ p (y t−p − µ) + e t , Then the predicted y t and residual vector r t are obtained as follows:

y b t = µ + ρ b 1 (y t−1 − µ) + · · · + b ρ b p (y t−p − µ) and r b t = y t − b y t ,

S t = max(S t−1 + a 0 r t − k, 0), (2.1) where a 0 t = √ m

Σ

m

Σ

m

and Σ r is the covariance matrix of r t .

The control scheme signals an out-of-control situation when the value of S t is greater than a predetermined value H.

3. Multioutput LS–SVR based residual control chart

3.1. Multioutput LS-SVR

Denote the training data set by (x, y) = {x i, y ij } n,m i,j=1 , with each input x i ∈ R d and the output y ij ∈ R. We consider the case of nonlinear regression of the form,

f j (x i ) = ω 0 j φ(x i ) + b j , (3.1) where the term b j is a bias term. Here the feature mapping function φ(·) : R d → R d

maps the input space to the higher dimensional feature space where the dimension d f is defined in an implicit way. Using a weighted quadratic loss function we define the optimization problem for estimation of (ω j , b j ) as follows:

min 1 2

m

X

j=1

||ω j || 2 + C 2

n

X

i=1 m

X

j,k=1

e ij V jk −1 e ik

over {ω j , b j , e ij } subject to equality constraints,

e ij = y ij − ω 0 j φ(x i ) − b j , i = 1, · · · , n, j = 1, · · · , m.

Here C > 0 is a penalty parameter, V is a covariance matrix of (e i1 , · · · , e im ) 0 and V jk −1 (j, k = 1, · · · , m) is the (jk)th element of the inverse of V , V −1 .

The Lagrangian function can be constructed from the optimization problem above,

L = 1 2

m

X

j=1

||ω j || 2 + C 2

n

X

i=1 m

X

j,k=1

e ij V jk −1 e ik −

n

X

i=1 n

X

j=1

α ij (e ij − y ij + ω 0 j φ(x i ) + b j ),

where α ij is Lagrange multiplier. The optimality conditions are given by

∂L

∂ω j

= 0 → ω j =

n

X

i=1

α ij φ(x i ), j = 1, · · · , m,

∂L

∂b j

= 0 →

n

X

i=1

α ij = 0, j = 1, · · · , m,

∂L

∂e ij

= 0 → α ij = C

Multioutput LS-SVR based residual MCUSUM control chart for autocorrelated process ^†

Changha Hwang ¹

y _t = f (y _t−1 , · · · , y _t−p ) + e _t , where e _t is an error vector with zero means.

If we suppose f is linear and µ is a mean vector such that µ = E(y _t ) for t = 1, 2, · · · , y _t can be reexpressed as

y _t = µ + ρ 1 (y _t−1 − µ) + · · · + ρ p (y _t−p − µ) + e t , Then the predicted y _t and residual vector r t are obtained as follows:

y b _t = µ + ρ b 1 (y _t−1 − µ) + · · · + b ρ b p (y _t−p − µ) and r b t = y t − b y _t ,

S t = max(S t−1 + a ⁰ r t − k, 0), (2.1) where a ⁰ _t = √ ^m

^Σ

and Σ _r is the covariance matrix of r _t .

Denote the training data set by (x, y) = {x i, y ij } ^n,m _i,j=1 , with each input x i ∈ R ^d and the output y ij ∈ R. We consider the case of nonlinear regression of the form,

f j (x i ) = ω ⁰ _j φ(x i ) + b j , (3.1) where the term b j is a bias term. Here the feature mapping function φ(·) : R ^d → R ^d

||ω j || ² + C 2

e _ij V _jk ⁻¹ e _ik

e ij = y ij − ω ⁰ _j φ(x i ) − b j , i = 1, · · · , n, j = 1, · · · , m.

Here C > 0 is a penalty parameter, V is a covariance matrix of (e i1 , · · · , e im ) ⁰ and V _jk ⁻¹ (j, k = 1, · · · , m) is the (jk)th element of the inverse of V , V ⁻¹ .

||ω j || ² + C 2

e ij V _jk ⁻¹ e ik −

α ij (e ij − y ij + ω ⁰ _j φ(x i ) + b j ),

where α _ij is Lagrange multiplier. The optimality conditions are given by

= 0 → α _ij = C

V _jk ⁻¹ e _ik , i = 1, · · · , n, j = 1, · · · , m,

Thus the optimal regression function for the given x _t is obtained as

which leads that ( α b _.j , b b _j ) can be obtained by LS-SVR (Suykens and Vanderwalle, 1999) using (x, y _.j ) for j = 1, · · · , m.

In fact V _jk is not known, ( α b _ij , b b _j ) can not be obtained as the solution to the linear equations (3.2) and (3.3). We use the iterative method with the initial V = I _m as follows:

V _jk = 1 n − 1

(y _ij − b f _j (x _i ))(y _ik − b f _k (x _i )), j, k = 1, · · · , m.

(y ij − b f _j ⁽⁻ⁱ⁾ (x i )) ² ,

(y ij − b f j (x i )) ²

(nm − trace(H)) ² , (3.5)

where H is a hat matrix such that ( b f 1 (x 1 ), b f 1 (x 2 ), · · · , b f m (x n )) ⁰ = H(y 11 , y 12 , · · · , y mn )).