Mathematical and Computer Modelling of Dynamical Systems
Methods, Tools and Applications in Engineering and Related Sciences
Volume 26, 2020 - Issue 2
Original Articles

On the combination of kernel principal component analysis and neural networks for process indirect control

Pages 144-168 | Received 04 Mar 2019, Accepted 28 Dec 2019, Published online: 07 Jan 2020

ABSTRACT

A new adaptive kernel principal component analysis (KPCA) scheme for non-linear discrete system control is proposed. The proposed approach can be viewed as a new data pre-processing technique: the input vector of the neural network controller is pre-processed by the KPCA method, and the resulting reduced neural network controller is then applied in indirect adaptive control. The influence of this input pre-processing on the accuracy of the neural network controller is discussed through numerical examples, covering a single-input single-output non-linear discrete system with time-varying parameters and a multi-input multi-output system. It is concluded that the KPCA method yields a significant reduction in both the control error and the identification error. The lowest mean squared error and mean absolute error show that the KPCA neural network with the sigmoid kernel function performs best.

1. Introduction

We are concerned with adaptive control of non-linear discrete systems using neural networks. The indirect adaptive control structure is based on two neural network blocks, one identifying the dynamic behaviour of the system and one acting as the system controller [Citation1–Citation6].

However, the size of the neural network model or of the neural network controller can speed up or slow down the training phase. The problem of reducing the dimension of large neural networks has been widely addressed by different techniques [Citation7–Citation37].

The first step in a reduction method is feature selection (new features are selected from the original inputs) or feature extraction (new features are transformed from the original inputs). In modelling, all available indicators can be used, but correlated or irrelevant features can deteriorate the generalization performance of any model [Citation7–Citation18].

Many linear dimensionality reduction techniques have been proposed. For instance, Kohonen self-organizing feature maps provide a way of representing multidimensional data in much lower-dimensional spaces [Citation19]; curvilinear component analysis [Citation20] and curvilinear distance analysis [Citation21] have been proposed to reduce the original dimension of face images and of data for classification in medical imaging [Citation22]; and principal component analysis (PCA) has been widely used for reducing high dimensions in many applications [Citation16–Citation18,Citation23–Citation25].

PCA is a well-known method for feature extraction [Citation23,Citation24]. By calculating the eigenvectors of the covariance matrix of the original inputs, PCA linearly transforms a high-dimensional input vector into a new low-dimensional one whose components are uncorrelated. The basis function orders of PCA, as a typical approach, are the lowest in the sense of model dimension reduction [Citation16–Citation18,Citation23–Citation25].

PCA also appears in many other applications. For instance, in the study by Zhang et al. [Citation15], a hybrid modelling strategy consists of a decoupled non-linear radial basis function neural network model based on PCA and a linear autoregressive exogenous model. PCA reduces the cross-validation time required to identify optimal model hyper-parameters [Citation25]. In the study by Seerapu and Srinivas [Citation26], it was combined with linear discriminant analysis to improve the reduction. In the study by Peleato et al. [Citation27], the use of fluorescence data coupled with neural networks based on PCA for improved predictability of drinking water disinfection by-products was investigated. In the study by Qinshu et al. [Citation14], PCA for feature selection, together with a grid search and k-fold cross-validation approach for parameter optimization in the support vector machine, was developed. Finally, other linear dimensionality reduction techniques, such as multidimensional scaling and probabilistic PCA, have been applied to user authentication using keystroke dynamics [Citation28] and in other methods [Citation29].

However, PCA is a linear time/space separation method and cannot be directly applied to non-linear systems [Citation30]. Non-linear PCA has therefore been developed using different algorithms. Kernel principal component analysis (KPCA) is a non-linear PCA developed using the kernel method, which was originally introduced for the Support Vector Machine (SVM) and later generalized to many algorithms expressed in terms of dot products, such as PCA. Specifically, KPCA first maps the original inputs into a high-dimensional feature space using the kernel method and then performs PCA in that feature space. Linear PCA in the high-dimensional feature space corresponds to a non-linear PCA in the original input space. More recently, another linear transformation method called independent component analysis (ICA) has been developed. Instead of producing uncorrelated components, ICA attempts to achieve statistically independent components in the transformed vectors. ICA was originally developed for blind source separation and was later generalized to feature extraction [Citation7].

KPCA is used as an effective method for tackling the problem of non-linear data [Citation31]. In the study by Chakour et al. [Citation32], an adaptive KPCA algorithm is proposed for dynamic process monitoring, combining two existing algorithms: recursive weighted PCA and moving-window KPCA. Fault detection of non-linear systems using the KPCA method, extracting a reduced number of measurements from the training data, has also been studied [Citation33]. In the study by Xiao and He [Citation34], a neural-network-based fault diagnosis approach for analog circuits is developed, using maximal-class-separability-based KPCA as a preprocessor to reduce the dimensionality of candidate features, so as to obtain optimal features with maximal class separability as inputs to the neural networks. In the study by Reddy and Ravi [Citation36], a differential evolution (DE)-trained kernel principal component wavelet neural network (KPCWNN) and a DE-trained kernel binary quantile regression are proposed for classification; in the DE-KPCWNN technique, KPCA is applied to the input data to obtain kernel principal components, on which the WNN is employed.

In the study by Klevecka and Lelis [Citation37], a functional algorithm for preprocessing neural network input data, taking into account the specific aspects of teletraffic and the properties of neural networks, is created. Its practical application to forecasting telecommunication data sequences shows that data preprocessing decreases the learning time and increases the plausibility and accuracy of the forecasts.

In this paper, a neural-network-based indirect adaptive control scheme is used. The neural network relies on an adaptive learning rate and a reduced derivative of the activation function. Moreover, the weights of the neural network model and of the neural network controller are updated based on the identification error and the control error, respectively, and are used to generate the appropriate control.

On the one hand, in various studies [Citation1,Citation2,Citation5,Citation6,Citation15,Citation38,Citation39], the authors developed algorithms for adaptive indirect control without any preprocessing and did not take into account the high dimension of the neural network.

On the other hand, in the study by Errachdi and Benrejeb [Citation4], the authors developed an algorithm to accelerate the training phase in adaptive indirect control based on a neural network controller, using a variable learning rate and a Taylor expansion of the derivative of the activation function, but they did not address high dimensionality. That is why, in this paper, we propose a new algorithm that reduces the input vector of the neural controller in the control system based on KPCA. The proposed data preprocessing scheme decreases the learning time and increases the accuracy of the system control.

The present paper is organized as follows. After this introduction, Section 2 reviews the proposed KPCA method for system control and develops the proposed neural network controller based on the KPCA method. Section 3 details the proposed algorithm. In Section 4, examples of non-linear systems are presented to illustrate the efficiency of the proposed method. Section 5 concludes the paper.

2. The proposed KPCA neural network controller approach

On the basis of the input and output relations of a system, a discrete non-linear system can be expressed by a NARMA (Non-linear Autoregressive Moving Average) model [Citation4,Citation35] given by

$y(k+1) = f\big(y(k), \ldots, y(k-n_y), u(k), \ldots, u(k-n_u)\big)$ (1)

where $f(\cdot)$ is the non-linear mapping specified by the model, $y(k)$ and $u(k)$ are the output and the input of the system, respectively, $k$ is the discrete time, and $n_y$ and $n_u$ are the numbers of past output and input samples, respectively, required for prediction.

The aim of this paper is to find a control law $u(k)$ for the non-linear system given by Equation (1), based on the KPCA approach, such that the system output $y(k)$ tracks, where possible, the desired value $r(k)$.

The indirect control architecture is shown in Figure 1; the weights of the neural network model and of the neural network controller are trained by different errors, where $e(k)$ is the identification error, $\hat{e}_c(k)$ is the estimated tracking error and $e_c(k)$ is the tracking error [Citation4].

Figure 1. The architecture of indirect neural control.


The architecture shown in Figure 1 comprises two neural blocks. The weights of the neural model are adjusted by the identification error $e(k)$, whereas the weights of the neural controller are trained by the tracking error $e_c(k)$ [Citation4].

A multi-layer perceptron is used for both the neural model and the neural controller. Each block consists of three layers, and the sigmoid activation function $s(\cdot)$ is used for all neurons [Citation4].

2.1. The neural network model

The principle of the neural network model is given in Figure 2.

Figure 2. The principle of neural network model.


The output of the $j$th node of the hidden layer is described as follows:

$h_j = \sum_{i=1}^{n_1} w_{ji}\, x_i, \quad j = 1, 2, \ldots, n_2$ (2)

where n1 is the number of nodes of the input layer, n2 is the number of nodes of the hidden layer and wji is the hidden weight.

The input vector of the neural network model is

$x = [u(k), u(k-1), u(k-2), \ldots]^T$ (3)

where u(k) is the neural network controller output.

The output of the neural network model is given by the following equation:

$y_r(k+1) = \lambda\, s\Big(\sum_{j=1}^{n_2} w_{1j}\, s(h_j)\Big)$ (4)

where λ is a scaling coefficient and w1j is the output weight.

The compact form of the output is given by the following equation:

$y_r(k+1) = \lambda\, s(h_1) = \lambda\, s\big[w_1^T S(Wx)\big]$ (5)

with $x = [x_i]^T$, $i = 1, \ldots, n_1$; $W = [w_{ji}]$, $i = 1, \ldots, n_1$, $j = 1, \ldots, n_2$; $S(Wx) = [s(h_j)]^T$, $j = 1, \ldots, n_2$; $w_1 = [w_{1j}]^T$, $j = 1, \ldots, n_2$.

The identification error $e(k)$ is given by

$e(k) = y(k) - y_r(k)$ (6)

The cost function is given by the following equation:

$E = \frac{1}{2}\, e(k)^2$ (7)

The output weights are updated by the following equation:

$w_{1j}(k+1) = w_{1j}(k) + \Delta w_{1j}(k)$ (8)

where $\Delta w_{1j}$, $j = 1, \ldots, n_2$, is obtained by minimizing the cost function:

$\Delta w_{1j} = -\eta(k)\, \dfrac{\partial E(k)}{\partial w_{1j}} = -\eta(k)\, \dfrac{\partial E(k)}{\partial e(k)}\, \dfrac{\partial e(k)}{\partial h_1}\, \dfrac{\partial h_1}{\partial w_{1j}} = \lambda\, \eta(k)\, e(k)\, s'(h_1)\, S(Wx)$ (9)

$\eta(k)$ is the variable learning rate for the weights of the neural network model, $0 \leq \eta(k) \leq 1$, given by

$\eta(k) = \dfrac{1}{\lambda^2\, s'^2(h_1)\big[S^T(Wx)\, S(Wx) + w_{1j}^T\, S'(Wx)\, S'(Wx)\, w_{1j}\; x^T x\big]}$ (10)

$s'(h_1)$ is the derivative of $s(h_1)$, defined as follows:

$s'(h_1) = s(h_1)\big(1 - s(h_1)\big) = \dfrac{e^{h_1}}{(1 + e^{h_1})^2} \approx \dfrac{1}{4} + \dfrac{1}{2}\, h_1 + O(h_1^3)$ (11)

The hidden weights are updated by the following equation:

$w_{ji}(k+1) = w_{ji}(k) + \Delta w_{ji}(k)$ (12)

where $\Delta w_{ji}$ is given by the following equation:

$\Delta w_{ji} = -\eta(k)\, \dfrac{\partial E(k)}{\partial w_{ji}} = -\eta(k)\, \dfrac{\partial E(k)}{\partial e(k)}\, \dfrac{\partial e(k)}{\partial h_1}\, \dfrac{\partial h_1}{\partial h_j}\, \dfrac{\partial h_j}{\partial w_{ji}} = \lambda\, \eta(k)\, s'(h_1)\, S'(Wx)\, w_{1j}\, x^T\, e(k)$ (13)

with $S'(Wx) = \mathrm{diag}\,[s'(h_j)]$, $j = 1, \ldots, n_2$.

For the stability of the neural network model, a Lyapunov function is detailed. Indeed, let us define the discrete Lyapunov function

$V(k) = E(k) = \frac{1}{2}\, e(k)^2$ (14)

where $e(k)$ is the identification error given by Equation (6). The change in the Lyapunov function is obtained by

$\Delta V(k) = V(k+1) - V(k) = \frac{1}{2}\big(e(k+1)^2 - e(k)^2\big)$ (15)

The identification error difference can be represented by

$\Delta e(k) = e(k+1) - e(k) \approx -\eta(k)\, \Big(\dfrac{\partial y_r(k)}{\partial w_i(k)}\Big)^T \dfrac{\partial y_r(k)}{\partial w_i(k)}\, e(k)$ (16)

where $w_i(k)$ denotes the synaptic weights of the neural network identifier ($w_{1j}(k)$ and $w_{ji}(k)$). Using Equation (16), the identification error becomes

$e(k+1) = e(k) - \eta(k)\, \xi(k)\, e(k)$ (17)

with

$\xi(k) = \lambda^2\, s'^2(h_1)\big[S^T(Wx)\, S(Wx) + w_{1j}^T\, S'(Wx)\, S'(Wx)\, w_{1j}\; x^T x\big]$ (18)

From Equations (17) and (18), the convergence of the identification error $e(k)$, i.e. $\lim_{k\to+\infty} e(k) = 0$, is guaranteed if $0 < \eta(k) < 2\,\xi^{-1}(k)$, which ensures $\Delta V(k) < 0$ with $V(k) > 0$ from Equation (14).

A suitable online algorithm is obtained by choosing the variable learning rate $\eta(k) = \xi^{-1}(k)$.
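To make the model update concrete, the following Python sketch implements Equations (2)-(13) with the variable learning rate $\eta(k) = \xi^{-1}(k)$. It is a minimal illustration assuming the equations as reconstructed above; the class name, dimensions and weight initialization are ours, not the authors'.

    import numpy as np

    def sigmoid(h):
        return 1.0 / (1.0 + np.exp(-h))

    class NNModel:
        """One-hidden-layer model of Section 2.1: y_r = lam * s(w1^T S(W x))."""

        def __init__(self, n1, n2, lam=1.0, seed=0):
            rng = np.random.default_rng(seed)
            self.W = 0.1 * rng.standard_normal((n2, n1))   # hidden weights w_ji
            self.w1 = 0.1 * rng.standard_normal(n2)        # output weights w_1j
            self.lam = lam

        def forward(self, x):
            Sh = sigmoid(self.W @ x)                # S(Wx), Eq. (2)
            h1 = self.w1 @ Sh                       # hidden-to-output activation
            return self.lam * sigmoid(h1), Sh, h1   # y_r, Eqs. (4)-(5)

        def update(self, x, y):
            """One online step; returns the identification error e(k), Eq. (6)."""
            yr, Sh, h1 = self.forward(x)
            e = y - yr
            ds1 = sigmoid(h1) * (1.0 - sigmoid(h1))   # exact s'(h1) of Eq. (11)
            w1dS = self.w1 * Sh * (1.0 - Sh)          # S'(Wx) w1, diagonal form
            # xi(k) as in Eq. (18); eta(k) = 1/xi(k) is the variable rate, Eq. (10)
            xi = self.lam**2 * ds1**2 * (Sh @ Sh + (w1dS @ w1dS) * (x @ x))
            eta = 1.0 / max(xi, 1e-8)                 # guard against division by zero
            self.w1 += self.lam * eta * e * ds1 * Sh                 # Eq. (9)
            self.W += self.lam * eta * e * ds1 * np.outer(w1dS, x)   # Eq. (13)
            return e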

2.2. The KPCA neural network controller

The PCA technique is a lower-dimensional projection method that can be used in multivariate data mining [Citation25,Citation30–Citation32,Citation40]. The main idea behind PCA is to represent multidimensional data with a smaller number of variables while retaining the main features of the data. It is inevitable that some features are lost by reducing the dimensionality; PCA seeks the projection into a lower-dimensional space that retains as much of the variability of the data as possible [Citation4,Citation25,Citation30–Citation32,Citation40].

However, PCA is a linear technique and cannot capture the non-linear structure of a data set. For this reason, a non-linear generalization using the kernel method has been proposed, which computes the principal components of the data set mapped non-linearly into some high-dimensional feature space $\zeta$. Because the sample data are only implicitly mapped from the input space to the higher-dimensional feature space $\zeta$, KPCA is implemented efficiently by virtue of the kernel trick and can be solved as an eigenvalue problem of its kernel matrix.

In this section, we propose to reduce the input vector of the neural network controller of the adaptive indirect control structure. The new architecture of the adaptive indirect KPCA neural network control is given in Figure 3.

Figure 3. The new architecture of indirect neural control.


We recall that the input vector of the neural network controller is

$z = [r(k), r(k-1), r(k-2), \ldots]^T$ (19)

where r(k) is the desired value.

For the input data $\{z_k\}_{k=1}^{l}$, let $\phi$ denote the non-linear map into $\zeta$. The covariance matrix $C$ of the mapped data is defined as

$C = \dfrac{1}{l} \sum_{j=1}^{l} \phi(z_j)\, \phi(z_j)^T$ (20)

Its eigenvalues $\lambda_k$ and eigenvectors $p_k$ satisfy

$C p_k = \lambda_k\, p_k, \quad k = 1, \ldots, l$ (21)

From Equation (20), Equation (21) may be written as

$\dfrac{1}{l} \sum_{j=1}^{l} \phi(z_j)\big(\phi(z_j)^T p_k\big) = \lambda_k\, p_k$ (22)

$p_k$ can be rewritten as

$p_k = \sum_{j=1}^{l} \alpha_j\, \phi(z_j)$ (23)

with $\alpha_j$, $j = 1, \ldots, l$, the expansion coefficients. Equation (22) can then be rewritten as

$\dfrac{1}{l} \sum_{j=1}^{l} \phi(z_j)\Big(\phi(z_j)^T \sum_{i=1}^{l} \alpha_i\, \phi(z_i)\Big) = \lambda_k \sum_{i=1}^{l} \alpha_i\, \phi(z_i)$ (24)

The kernel function $k_r(z_i, z_j)$ is defined as

$k_r(z_i, z_j) = \phi(z_i)^T \phi(z_j)$ (25)

Multiplying Equation (24) on the left by $\phi(z_d)^T$ gives

$\dfrac{1}{l} \sum_{j=1}^{l} \phi(z_d)^T \phi(z_j)\Big(\phi(z_j)^T \sum_{i=1}^{l} \alpha_i\, \phi(z_i)\Big) = \lambda_k \sum_{i=1}^{l} \alpha_i\, \phi(z_d)^T \phi(z_i)$ (26)

Using the kernel function (25), Equation (26) becomes

$\dfrac{1}{l} \sum_{i=1}^{l} k_r(z_d, z_i) \sum_{j=1}^{l} \alpha_j\, k_r(z_i, z_j) = \lambda_k \sum_{i=1}^{l} \alpha_i\, k_r(z_d, z_i)$ (27)

with $k_r(z_d, z_i) = \phi(z_d)^T \phi(z_i)$.

The resulting kernel principal components can be calculated using

$x_r(k) = \phi(z)^T p_k = \sum_{i=1}^{l} \alpha_i\, k_r(z, z_i)$ (28)

The reduced space of the signal given by Equation (28) constitutes the input vector of the neural network controller.

This dimensionality reduction is employed to reduce the dimension of the feature vectors before they are fed as input:

$x_1 = [x_r(k), x_r(k-1), x_r(k-2), \ldots]^T$ (29)

The primary purpose of data pre-processing is to modify the input variables so that they better match the predicted output. The main purpose of neural network data transformation is to modify the distribution of the network input parameters without losing much information.

Using the reduced input vector $x_1$, the output of the $j$th node of the hidden layer is described as follows:

$h_{cj} = \sum_{i=1}^{n_3} v_{ji}\, x_{1i}, \quad j = 1, \ldots, n_4$ (30)

where $n_3$ is the number of nodes of the input layer and $v_{ji}$ is the hidden weight.

Similarly, the output of the neural controller is given by the following equation:

$u(k) = \lambda_c\, s\Big(\sum_{j=1}^{n_4} v_{1j}\, s(h_{cj})\Big) = \lambda_c\, s\Big(\sum_{j=1}^{n_4} v_{1j}\, s\Big(\sum_{i=1}^{n_3} v_{ji}\, x_{1i}\Big)\Big)$ (31)

where n4 is the number of nodes of the hidden layer, λc is a scaling coefficient and v1j is the output weight.

The compact form of the control input to the system is given by the following equation:

$u(k) = \lambda_c\, s(h_{c1}) = \lambda_c\, s\big[v_1^T S(Vx_1)\big]$ (32)

with $x_1 = [x_{1i}]^T$, $i = 1, \ldots, n_3$; $V = [v_{ji}]$, $i = 1, \ldots, n_3$, $j = 1, \ldots, n_4$; $S(Vx_1) = [s(h_{cj})]^T$, $j = 1, \ldots, n_4$; $v_1 = [v_{1j}]^T$, $j = 1, \ldots, n_4$.

The tracking error $e_c(k)$ is given by the following equation:

$e_c(k) = y(k) - r(k)$ (33)

where r(k) is the desired output.

The weights of the neural controller are updated by minimizing the cost function defined as follows:

$E_c = \frac{1}{2}\, e_c(k)^2$ (34)

The output weights are updated by

$v_{1j}(k+1) = v_{1j}(k) + \Delta v_{1j}(k)$ (35)

with $\Delta v_{1j}$, $j = 1, \ldots, n_4$, the incremental change of the output weights:

$\Delta v_{1j} = -\eta_c(k)\, \dfrac{\partial E_c(k)}{\partial v_{1j}} = -\eta_c(k)\, \dfrac{\partial E_c}{\partial e_c(k)}\, \dfrac{\partial e_c(k)}{\partial y(k)}\, \dfrac{\partial y_r(k)}{\partial h_1}\, \dfrac{\partial h_1}{\partial s(h_j)}\, \dfrac{\partial s(h_j)}{\partial h_j}\, \dfrac{\partial h_j}{\partial u(k)}\, \dfrac{\partial u(k)}{\partial h_{c1}}\, \dfrac{\partial h_{c1}}{\partial v_{1j}} = -\eta_c(k)\, \lambda_c\, e_c(k)\, s'(h_1)\, w_{1j}\, S'(Wx)\, w_{ji}\, s'(h_{c1})\, S(Vx_1)$ (36)

where $\eta_c(k)$ is the variable learning rate for the weights of the neural network controller, $0 \leq \eta_c(k) \leq 1$, given by

$\eta_c(k) = \dfrac{1}{\lambda_c^2\, s'^2(h_{c1})\, s'(h_1)\, w_{1j}\, w_{ji}\, S'(Wx)\big[S^T(Vx_1)\, S(Vx_1) + v_{1j}^T\, S'(Vx_1)\, S'(Vx_1)\, v_{1j}\; x_1^T x_1\big]}$ (37)

Concerning the hidden weights, they are updated by

$v_{ji}(k+1) = v_{ji}(k) + \Delta v_{ji}(k)$ (38)

where $\Delta v_{ji}$ is given by

$\Delta v_{ji} = -\eta_c(k)\, \dfrac{\partial E_c(k)}{\partial v_{ji}} = -\eta_c(k)\, \dfrac{\partial E_c}{\partial e_c}\, \dfrac{\partial e_c}{\partial y}\, \dfrac{\partial y_r}{\partial h_1}\, \dfrac{\partial h_1}{\partial s(h_j)}\, \dfrac{\partial s(h_j)}{\partial h_j}\, \dfrac{\partial h_j}{\partial u}\, \dfrac{\partial u}{\partial h_{c1}}\, \dfrac{\partial h_{c1}}{\partial h_{cj}}\, \dfrac{\partial h_{cj}}{\partial v_{ji}} = -\eta_c(k)\, \lambda_c\, e_c(k)\, s'(h_1)\, w_{1j}\, S'(Wx)\, w_{ji}\, s'(h_{c1})\, v_{1j}\, S'(Vx_1)\, x_1^T$ (39)

with $S'(Vx_1) = \mathrm{diag}\,[s'(h_{cj})]$, $j = 1, \ldots, n_4$.

Let $\Psi = [\phi(z_1), \ldots, \phi(z_l)]$, $1_l = (1/l)_{l \times l}$ and $\tilde{\Gamma} = \Psi^T \Psi$. The centred kernel matrix $\Gamma$ is defined as

$\Gamma = \tilde{\Gamma} - 1_l \tilde{\Gamma} - \tilde{\Gamma} 1_l + 1_l \tilde{\Gamma} 1_l$ (40)

with $\tilde{\Gamma}_{ij} = \phi(z_i)^T \phi(z_j) = k_r(z_i, z_j)$. In this paper, different kernel functions are used; they are defined in Table 1.

Table 1. The usual kernel functions.
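The entries of Table 1 are not reproduced here. As a hedged illustration, the following Python definitions list kernels commonly used with KPCA, including the sigmoid kernel retained later in the paper; the parameter values are placeholders of our choosing, not the paper's.

    import numpy as np

    # Common KPCA kernels k_r(z_i, z_j); d, c, sig, a, b are illustrative parameters.
    kernels = {
        "linear":     lambda zi, zj: float(zi @ zj),
        "polynomial": lambda zi, zj, d=2, c=1.0: float((zi @ zj + c) ** d),
        "rbf":        lambda zi, zj, sig=1.0: float(np.exp(-np.sum((zi - zj) ** 2) / (2.0 * sig**2))),
        "sigmoid":    lambda zi, zj, a=1.0, b=0.0: float(np.tanh(a * (zi @ zj) + b)),
    }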

The retained principal components are the first $s$ eigenvectors, associated with the highest eigenvalues; they are often sufficient to describe the structure of the data. The number $s$ satisfies the Inertia Percentage Criterion (IPC) [Citation25] given by

$s = \arg(\mathrm{IPC} \geq 99)$ (41)

with

$\mathrm{IPC} = 100\, \dfrac{\sum_{i=1}^{s} \lambda_i}{\sum_{i=1}^{l} \lambda_i}$ (42)
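A compact sketch of the KPCA computation of Equations (20)-(28) and (40)-(42) is given below: build the kernel matrix, centre it as in Equation (40), keep the $s$ leading components according to the IPC, and project a new input by Equation (28). The unit-norm scaling of the coefficients $\alpha$ and the omission of centring for new points are simplifications of ours.

    import numpy as np

    def kpca_fit(Z, kernel, ipc_threshold=99.0):
        """Z: (l, m) array of training inputs; returns scaled coefficients alpha (l, s)."""
        l = Z.shape[0]
        G = np.array([[kernel(zi, zj) for zj in Z] for zi in Z])  # Gamma~, Eq. (25)
        one_l = np.full((l, l), 1.0 / l)
        Gc = G - one_l @ G - G @ one_l + one_l @ G @ one_l        # centring, Eq. (40)
        lam, A = np.linalg.eigh(Gc)
        lam, A = lam[::-1], A[:, ::-1]                    # decreasing eigenvalue order
        lam = np.clip(lam, 0.0, None)
        ipc = 100.0 * np.cumsum(lam) / np.sum(lam)        # Eq. (42)
        s = int(np.searchsorted(ipc, ipc_threshold)) + 1  # Eq. (41)
        # scale alpha_k so that the feature-space eigenvector p_k has unit norm
        return A[:, :s] / np.sqrt(np.maximum(lam[:s], 1e-12))

    def kpca_project(z, Z, alpha, kernel):
        """Components x_r for a new input z, Eq. (28)."""
        kvec = np.array([kernel(z, zi) for zi in Z])
        return kvec @ alpha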

We have thus developed a neural network controller based on a reduced input vector and a variable learning rate; this approach increases the training speed.

For the stability of the neural network controller, a Lyapunov function is again detailed. Indeed, let us define the discrete Lyapunov function

$V_c(k) = E_c(k) = \frac{1}{2}\, e_c(k)^2$ (43)

where $e_c(k)$ is the control error. The change in the Lyapunov function is obtained by

$\Delta V_c(k) = V_c(k+1) - V_c(k) = \frac{1}{2}\big(e_c(k+1)^2 - e_c(k)^2\big)$ (44)

The control error difference can be represented by

$\Delta e_c(k) = e_c(k+1) - e_c(k) \approx -\eta_c(k)\, \Big(\dfrac{\partial e_c(k)}{\partial v_c(k)}\Big)^T \dfrac{\partial y(k)}{\partial u_c(k)}\, \dfrac{\partial u_c(k)}{\partial v_c(k)}\, e_c(k)$ (45)

where $v_c(k)$ denotes the synaptic weights of the neural network controller ($v_{1j}(k)$ and $v_{ji}(k)$). Using Equation (45), the control error becomes

$e_c(k+1) = e_c(k) - \eta_c(k)\, \xi_c(k)\, e_c(k)$ (46)

with

$\xi_c(k) = \lambda_c^2\, s'^2(h_{c1})\, s'(h_1)\, w_{1j}\, w_{ji}\, S'(Wx)\big[S^T(Vx_1)\, S(Vx_1) + v_{1j}^T\, S'(Vx_1)\, S'(Vx_1)\, v_{1j}\; x_1^T x_1\big]$ (47)

From Equations (46) and (47), the convergence of the control error $e_c(k)$, i.e. $\lim_{k\to+\infty} e_c(k) = 0$, is guaranteed if $0 < \eta_c(k) < 2\,\xi_c^{-1}(k)$, which ensures $\Delta V_c(k) < 0$ with $V_c(k) > 0$ from Equation (43).

A suitable online algorithm for real-time applications is obtained by choosing the variable learning rate $\eta_c(k) = \xi_c^{-1}(k)$.

3. The proposed algorithm

In this section, a summary of the proposed online KPCA neural network controller algorithm is presented.

Offline phase

  1. Initialize the neural network parameters ($v_{1j}$, $v_{ji}$, $w_{1j}$, $w_{ji}$) using $M$ observations ($M \leq N$),

  2. Determine the matrix $C$: centre the data and perform the eigenvalue decomposition,

  3. Determine the orthogonal eigenvectors and the eigenvalues of the covariance matrix,

  4. Order the eigenvectors in decreasing order of the corresponding eigenvalues,

  5. Choose $x_r(k)$ satisfying Equation (28), using the $s$ retained principal components given by Equations (41) and (42).

Online phase

  1. At time instant $(k+1)$, a new data pair $(u(k+1), y(k+1))$ is available. Using the obtained input vector $x_1$, if the condition $|e(k+1)| < \varepsilon_1$, where $\varepsilon_1 > 0$ is a given small constant, is satisfied, then the neural network model given by Equation (5) approximates the behaviour of the system sufficiently well.

  2. If the condition $|e_c(k+1)| < \varepsilon_2$, where $\varepsilon_2 > 0$ is a given small constant, is satisfied, then the reduced neural network controller provides a satisfactory control law $u(k)$.

  3. If $|e(k+1)| < \varepsilon_1$ is not satisfied, the synaptic weights of the neural network model are updated using Equations (8) and (12),

  4. If $|e_c(k+1)| < \varepsilon_2$ is not satisfied, the synaptic weights of the neural network controller are updated using Equations (35) and (38),

  5. End.
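The following Python skeleton shows how the two phases fit together at run time. The model, controller, plant and KPCA projection are hypothetical stubs (placeholders of ours), so only the control flow of steps 1-5 is illustrated.

    import numpy as np

    # Hypothetical stubs standing in for the blocks of Sections 2.1 and 2.2
    model_step = lambda x1, y: 0.0     # one model update; returns e(k+1)
    ctrl_output = lambda x1: 0.0       # control law u(k), Eq. (32)
    plant_step = lambda u, k: 0.0      # system response y(k+1)
    project = lambda z: z              # offline-fitted KPCA reduction, Eq. (28)
    update_model = lambda: None        # Eqs. (8) and (12)
    update_controller = lambda: None   # Eqs. (35) and (38)

    eps1 = eps2 = 1e-2
    N = 100
    r = 0.45 * np.ones(N)              # desired value (placeholder)
    y = 0.0
    for k in range(2, N - 1):
        z = np.array([r[k], r[k - 1], r[k - 2]])  # raw controller input, Eq. (19)
        x1 = project(z)                           # reduced input x1, Eq. (29)
        u = ctrl_output(x1)
        y = plant_step(u, k)                      # new data (u(k+1), y(k+1))
        e = model_step(x1, y)                     # identification error
        if abs(e) >= eps1:                        # step 3: model update needed
            update_model()
        if abs(y - r[k]) >= eps2:                 # step 4: controller update needed
            update_controller()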

4. Simulation results

In this section, two non-linear discrete systems are used: the first is a single-input single-output non-linear time-varying system and the second is a multi-input multi-output (MIMO) system.

4.1. Example of time-varying system

The time-varying non-linear system is described by the input–output model in the following equation [Citation41]:

$y(k+1) = \dfrac{y(k)\, y(k-1)\, y(k-2)\, u(k-1)\big(y(k-2) - 1\big) + u(k)}{a_0(k) + a_1(k)\, y^2(k-1) + a_2(k)\, y^2(k-2)}$ (48)

where $y(k)$ and $u(k)$ are, respectively, the output and the input of the time-varying non-linear system at instant $k$; $a_0(k)$, $a_1(k)$ and $a_2(k)$ are given by

$a_0(k) = 1, \quad a_1(k) = 1 + 0.2\cos(k), \quad a_2(k) = 1 + 0.2\sin(k)$ (49)
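For reference, a direct transcription of the benchmark plant of Equations (48) and (49), as reconstructed above, reads:

    import numpy as np

    def plant_step(y_hist, u_hist, k):
        """One step of Eqs. (48)-(49).
        y_hist = (y(k), y(k-1), y(k-2)), u_hist = (u(k), u(k-1))."""
        a0 = 1.0
        a1 = 1.0 + 0.2 * np.cos(k)
        a2 = 1.0 + 0.2 * np.sin(k)
        num = y_hist[0] * y_hist[1] * y_hist[2] * u_hist[1] * (y_hist[2] - 1.0) + u_hist[0]
        den = a0 + a1 * y_hist[1] ** 2 + a2 * y_hist[2] ** 2
        return num / den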

The trajectories of $a_1(k)$ and $a_2(k)$ are given in Figure 4.

Figure 4. a1(k) and a2(k) trajectories.


In this section, in order to examine the effectiveness of the proposed dimensionality reduction algorithm, different performance criteria are used.

Indeed, the mean squared identification error (MSE$_e$) and the mean absolute identification error (MAE$_e$) are, respectively, given by

$\mathrm{MSE}_e = \dfrac{1}{N} \sum_{k=1}^{N} \big(y(k) - y_r(k)\big)^2$ (50)

$\mathrm{MAE}_e = \dfrac{1}{N} \sum_{k=1}^{N} \big|y(k) - y_r(k)\big|$ (51)

where $y(k)$ is the time-varying system output, $y_r(k)$ is the neural network model output and the number of observations $N$ is 100.

The mean squared tracking error (MSE$_{e_c}$) and the mean absolute tracking error (MAE$_{e_c}$) are, respectively, given by

$\mathrm{MSE}_{e_c} = \dfrac{1}{N} \sum_{k=1}^{N} \big(y(k) - r(k)\big)^2$ (52)

$\mathrm{MAE}_{e_c} = \dfrac{1}{N} \sum_{k=1}^{N} \big|y(k) - r(k)\big|$ (53)

where $r(k)$ is the desired value.
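These four criteria reduce to two helper functions, a straightforward transcription of Equations (50)-(53):

    import numpy as np

    def mse(y, target):
        """Eqs. (50) and (52): mean squared error over the N observations."""
        y, target = np.asarray(y), np.asarray(target)
        return float(np.mean((y - target) ** 2))

    def mae(y, target):
        """Eqs. (51) and (53): mean absolute error over the N observations."""
        y, target = np.asarray(y), np.asarray(target)
        return float(np.mean(np.abs(y - target)))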

In this section, we examine the effectiveness of the proposed dimensionality reduction of the neural network controller input vector in the adaptive indirect control system.

In the offline phase, a reduced number of observations ($M = 3$) is used to initialize the neural network parameters ($w_{1j}$, $w_{ji}$, $v_{1j}$, $v_{ji}$) and to compute the KPCA quantities: the matrix $C$, its eigenvalues and eigenvectors, and finally the reduced input vector $x_r(k)$ given by Equation (28), based on the $s$ retained principal components given by Equations (41) and (42).

In the online phase, at instant $(k+1)$, the input vector of the neural network controller is $x_1 = [x_r(k), x_r(k-1), x_r(k-2), x_r(k-3), x_r(k-4)]^T$.

In this case, the neural network model and the pre-processing neural network controller each consist of a single input, one hidden layer with 8 nodes and a single output node, with variable learning rates $\eta(k)$ for the model and $\eta_c(k)$ for the controller. The scaling coefficients are $\lambda = \lambda_c = 1$ and $\varepsilon_1 = \varepsilon_2 = 10^{-2}$.

To select a suitable kernel function, the kernels defined in Table 1 are compared; the sigmoid kernel gives the lowest MSE$_e$, indicating that it is the most reliable choice. The comparison results are given in Table 2.

Table 2. Comparison results of the kernel functions on the identification error.

As a baseline, the features are fed directly to a multilayer perceptron (MLP) neural network without any KPCA preprocessing, and the online MLP neural network model output and the plant output are compared. The input vector of the MLP neural network is $[r(k), r(k-1), r(k-2), r(k-3), r(k-4), r(k-5)]^T$, with one hidden layer of 23 nodes and a variable learning rate. An excellent concordance between the plant output and the desired value is observed, with a mean squared error equal to $6.9269 \times 10^{-7}$.

The output of the reduced online MLP neural network controller and the desired values are presented in Figure 5. In this case, the KPCA method is combined with the multilayer perceptron neural network: the KPCA technique is used as a preprocessing method to reduce the feature dimension, and the obtained reduced vector is fed to the online multilayer perceptron neural network, with one hidden layer and variable learning rates. A concordance between the desired values and the plant output is noticed in Figure 5. To assess the efficiency of this combination further, several kernel functions are tested, and the results are presented in Table 2.

As defined in Table 1, we use the sigmoid function as the kernel function in the KPCA technique; the tracking control aim is to follow the reference signal as closely as possible using the proposed pre-processing neural network controller.

In this simulation, the desired value $r(k)$ is given by

$r(k) = \begin{cases} 0.45 & \text{for } k \leq 25 \\ 0.20 & \text{for } 26 \leq k \leq 50 \\ 0.45 & \text{for } 51 \leq k \leq 75 \\ 0.20 & \text{for } k > 75 \end{cases}$ (54)

We examine the influence of the dimensionality reduction of the neural network controller input vector on the identification error in Table 3 and on the control error in Table 4.

Table 3. The influence of the dimensionality reduction on the identification error.

Table 4. The influence of the dimensionality reduction on the control error.

From Tables 3 and 4, we observe that, using KPCA as a pre-processing phase to reduce the input vector of the neural network controller, the KPCA neural network controller has the smallest performance criteria for both the identification error $e(k)$ and the control error $e_c(k)$. These results are shown in Figures 5, 6 and 7.

Indeed, Figure 5 presents the pre-processing control system output and the desired values. In this case, the KPCA method is combined with a multilayer perceptron neural network controller.

The KPCA technique is used as a preprocessing method to reduce the feature dimension, and the obtained reduced vector is fed to the neural network controller. A concordance between the desired values and the control system output is noticed in Figure 5, although the parameters vary over time. Figures 6 and 7 present, respectively, the control law and the control error.

These figures reveal that the NN controller using KPCA as a pre-processing technique has smaller errors than the controller without pre-processing.

Figure 5. The pre-processing control system output and the desired values.


Figure 6. The control law.


Figure 7. The control error.


Another desired value r(k), given by Equation (55), is used to examine the effectiveness of the proposed algorithm of the dimensionality reduction of the neural network controller input vector in the adaptive indirect control system for the time-varying non-linear system.

Indeed, the neural network model and the neural network controller each consist of a single input, one hidden layer with 23 nodes and a single output node. The scaling coefficients are $\lambda = \lambda_c = 1$ and $\varepsilon_1 = \varepsilon_2 = 10^{-2}$.

In this simulation, the desired value $r(k)$ is given by

$r(k) = \begin{cases} 0.45 & \text{for } k \leq 25 \\ 0.20 & \text{for } 26 \leq k \leq 30 \\ 0.40 & \text{for } 31 \leq k \leq 35 \\ 0.30 & \text{for } 36 \leq k \leq 80 \\ 0.20 & \text{for } k > 80 \end{cases}$ (55)

Figure 8 presents the pre-processing control system output and the desired values. In this case, the KPCA method is combined with a multilayer perceptron neural network controller. A concordance between the desired values and the control system output is noticed, despite the time-varying parameters.

Figures 9 and 10 present, respectively, the control law and the control error. These figures reveal that the NN controller using KPCA as a pre-processing technique has smaller errors than the controller without pre-processing.

Figure 8. The pre-processing control system output and the desired values.


Figure 9. The control law.


Figure 10. The control error.


Tables 5 and 6 present the influence of the dimensionality reduction on the identification error and on the control error.

Table 5. The influence of the dimensionality reduction on the identification error.

Table 6. The influence of the dimensionality reduction on the control error.

From Tables 5 and 6, we observe that, by using KPCA as a pre-processing phase to reduce the input vector of the neural network controller, the KPCA neural network controller has the smallest performance criteria for both the identification error $e(k)$ and the control error $e_c(k)$. These results are shown in Figures 8, 9 and 10.

4.2. Effect of disturbances

An additive noise $v(k)$ is injected into the output of the time-varying non-linear system given by Equation (48), in order to test the effectiveness of the pre-processing neural network controller.

To measure the correspondence between the system output and the desired value, a signal-to-noise ratio (SNR) is used, given by the following equation:

$\mathrm{SNR} = \dfrac{\sum_{k=0}^{N} \big(y(k) - \bar{y}\big)^2}{\sum_{k=0}^{N} \big(v(k) - \bar{v}\big)^2}$ (56)

where $v(k)$ is a measurement noise with symmetric bound $\delta$, $v(k) \in [-\delta, \delta]$, and $\bar{y}$ and $\bar{v}$ are the output average value and the noise average value, respectively. In this paper, the SNR is taken as 5%.
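A possible transcription of the disturbance test is sketched below; the uniform law for $v(k)$ and the squared deviations in the empirical SNR are assumptions of this reconstruction.

    import numpy as np

    rng = np.random.default_rng(0)

    def bounded_noise(n, delta):
        """Measurement noise v(k) in [-delta, delta] (uniform law assumed)."""
        return rng.uniform(-delta, delta, size=n)

    def snr(y, v):
        """Empirical ratio of Eq. (56), squared deviations assumed."""
        y, v = np.asarray(y), np.asarray(v)
        return float(np.sum((y - y.mean()) ** 2) / np.sum((v - v.mean()) ** 2))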

Using the first desired value $r(k)$, the sensitivity of the proposed pre-processing neural network controller is examined in Tables 7 and 8, respectively.

Table 7. The influence of the dimensionality reduction on the identification error.

Table 8. The influence of the dimensionality reduction on the control error.

From these tables, we observe that, by using KPCA as a pre-processing phase to reduce the input vector of the neural network controller, the KPCA neural network controller has the smallest performance criteria for both the identification error and the control error.

Using the second desired value, the sensitivity of the proposed pre-processing neural network controller is examined in Tables 9 and 10, respectively.

Table 9. The influence of the dimensionality reduction on the identification error.

Table 10. The influence of the dimensionality reduction on the control error.

According to the obtained simulation results, despite the presence of disturbances in the system output and the time-varying parameters, the lowest MSE$_{e_c}$, MAE$_{e_c}$ and $\max(e_c)$ are obtained using the combination of the neural network controller and the KPCA technique.

4.3. Example of multi-input multi-output system

In this section, in order to examine the effectiveness of the proposed dimensionality reduction algorithm, a multi-input multi-output (MIMO) non-linear system, given by the following equations, is used:

$y_1(k+1) = \dfrac{y_1(k)}{1 + y_2^2(k)} + u_1(k), \quad y_2(k+1) = \dfrac{y_1(k)\, y_2(k)}{1 + y_2^2(k)} + u_2(k)$ (57)

where $y_i(k)$ and $u_i(k)$, $i = 1, 2$, are, respectively, the outputs and the inputs of the MIMO non-linear system at instant $k$; $r_1(k)$ and $r_2(k)$ are the reference signals given by

$r_1(k) = \sin\!\Big(\dfrac{2 k \pi}{100}\Big), \quad r_2(k) = \begin{cases} 0.8 & \text{for } k \leq 50 \\ 0.4 & \text{for } 51 \leq k \leq 100 \\ 0.8 & \text{for } 101 \leq k \leq 150 \\ 0.4 & \text{for } 151 \leq k \leq 200 \end{cases}$ (58)
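A direct transcription of the MIMO benchmark of Equations (57) and (58), as reconstructed above:

    import numpy as np

    def mimo_step(y1, y2, u1, u2):
        """One step of Eq. (57)."""
        d = 1.0 + y2 ** 2
        return y1 / d + u1, (y1 * y2) / d + u2

    def r1(k):
        """Sinusoidal reference r1(k), Eq. (58)."""
        return np.sin(2.0 * np.pi * k / 100.0)

    def r2(k):
        """Piecewise-constant reference r2(k), Eq. (58)."""
        if k <= 50:
            return 0.8
        if k <= 100:
            return 0.4
        if k <= 150:
            return 0.8
        return 0.4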

The control system outputs, the desired values and the control errors are presented in Figure 11, while Figure 12 presents the control laws $u_1$ and $u_2$. These figures reveal that the NN controller combined with KPCA as a pre-processing technique gives an excellent concordance between the system outputs and the desired outputs, with small control errors.

Figure 11. The control system output, the desired values and the control error.


Figure 12. The control law u1 and u2 trajectories.


In this case, the neural network model and the pre-processing neural network controller each consist of a single input, one hidden layer with 28 nodes and two output nodes, with variable learning rates $\eta_i(k)$ for the model and $\eta_{ic}(k)$ for the controller. The scaling coefficients are $\lambda_i = \lambda_{ci} = 1$ and $\varepsilon_i = 10^{-2}$, $i = 1, 2$.

The input vector of the neural network controller is $x_1 = [x_{r1}(k), x_{r1}(k-1), x_{r1}(k-2), x_{r2}(k), x_{r2}(k-1), x_{r2}(k-2)]^T$. The influence of the dimensionality reduction on the model error and on the control error is shown in Tables 11 and 12.

Table 11. The influence of the dimensionality reduction on the model error.

Table 12. The influence of the dimensionality reduction on the control error.

5. Conclusion

In this paper, an online combination of a neural network controller and the KPCA method is proposed and applied successfully to indirect adaptive control. Different kernel functions are tested; the lowest MSE$_e$, MAE$_e$, $\max(e)$, MSE$_{e_c}$, MAE$_{e_c}$ and $\max(e_c)$ are obtained with the sigmoid kernel function, which proves to be the best. The proposed algorithm is first applied to a single-input single-output system, with and without disturbances, demonstrating its robustness in rejecting disturbances and in accelerating the learning phase of the neural model and neural controller. It is then applied to a MIMO system, where it also gives good results.

Disclosure statement

No potential conflict of interest was reported by the authors.

References

  • O. Mohareri, R. Dhaouadi, and A.B. Rad, Indirect adaptive tracking control of a nonholonomic mobile robot via neural networks, Neurocomputing 88 (2012), pp. 54–66. doi:10.1016/j.neucom.2011.06.035.
  • A.A. Bohari, W.M. Utomo, Z.A. Haron, N.M. Zin, S.Y. Sim, and R.M. Ariff, Speed tracking of indirect field oriented control induction motor using neural network, Procedia Technol. 11 (2013), pp. 141–146. doi:10.1016/j.protcy.2013.12.173.
  • S. Slama, A. Errachdi, and M. Benrejeb, Adaptive PID controller based on neural networks for MIMO nonlinear systems, J. Theor. Appl. Inf. Technol. 97 (2) (2019), pp. 361–371.
  • A. Errachdi and M. Benrejeb, Performance comparison of neural network training approaches in indirect adaptive control, Int. J. Control. Autom. Syst. 16 (3) (2018), pp. 1448–1458. doi:10.1007/s12555-017-0085-3.
  • N. Ben, W. Ding, D.A. Naif, and E.A. Fuad, Adaptive neural state-feedback tracking control of stochastic nonlinear switched systems: An average dwell-time method, IEEE Trans. Neural Networks Learn. Syst. 30 (4) (2018), pp. 1076–1087. doi:10.1109/TNNLS.2018.2860944.
  • N. Ben, L. Yanjun, Z. Wanlu, L. Haitao, D. Peiyong, and L. Junqing, Multiple lyapunov functions for adaptive neural tracking control of switched nonlinear non-lower-triangular systems, IEEE Trans. Cybern. 99 (2019). doi:10.1109/TCYB.2019.2906372
  • P.O. Hoyer and A. Hyvärinen, Independent component analysis applied to feature extraction from colour and stereo images, Network 11 (3) (2000), pp. 191–210. doi:10.1088/0954-898X_11_3_302.
  • L.J. Cao, K.S. Chua, W.K. Chong, H.P. Lee, and Q.M. Gu, A comparison of PCA, KPCA and ICA for dimensionality reduction in support vector machine, Neurocomputing 55 (1–2) (2003), pp. 321–336. doi:10.1016/S0925-2312(03)00433-8.
  • I. Guyon and A. Eliseeff, An introduction to variable and feature selection, J. Mach. Learn. Res. 3 (2003), pp. 1157–1182.
  • J. Weston, S. Mukherjee, O. Chapelle, M. Pontil, T. Poggio, and V.N. Vapnik, Feature selection for SVMs, Adv. Neural Inform. Process. Syst. 13 (2001), pp. 668–674.
  • F.E.H. Tay and L.J. Cao, Saliency analysis of support vector machines for feature selection, Neural Network World 2 (1) (2001), pp. 153–166.
  • F.E.H. Tay and L.J. Cao, A comparative study of saliency analysis and genetic algorithm for feature selection in support vector machines, Intell. Data Anal. 5 (3) (2001), pp. 191–209. doi:10.3233/IDA-2001-5302.
  • K. Lee and V. Estivill-Castro, Feature extraction and gating techniques for ultrasonic shaft signal classification, Appl. Soft Comput. 7 (2007), pp. 156–165. doi:10.1016/j.asoc.2005.05.003.
  • H. Qinshu, L. Xinen, and X. Shifu, Comparison of PCA and model optimization algorithms for system identification using limited data, J. Appl. Sci. 13 (11) (2013), pp. 2082–2086. doi:10.3923/jas.2013.2082.2086.
  • R. Zhang, J. Tao, R. Lu, and Q. Jin, Decoupled ARX and RBF neural network modeling using PCA and GA optimization for nonlinear distributed parameter systems, IEEE Trans. Neural Networks Learn. Syst. 29 (2) (2018), pp. 457–469. doi:10.1109/TNNLS.2016.2631481.
  • M.L. Wang, X.D. Yan, and H.B. Shi, Spatiotemporal prediction for nonlinear parabolic distributed parameter system using an artificial neural network trained by group search optimization, Neurocomputing 113 (2013), pp. 234–240. doi:10.1016/j.neucom.2013.01.037.
  • S. Yin, S.X. Ding, A.H. Abandan Sari, and H.Y. Hao, Data-driven monitoring for stochastic systems and its application on batch process, Int. J. Syst. Sci. 44 (7) (2013), pp. 1366–1376. doi:10.1080/00207721.2012.659708.
  • E. Aggelogiannaki and H. Sarimveis, Nonlinear model predictive control for distributed parameter systems using data driven artificial neural network models, Comput. Chem. Eng. 32 (6) (2008), pp. 1225–1237. doi:10.1016/j.compchemeng.2007.05.002.
  • M. Madhusmita and H.S. Behera, Kohonen self organizing map with modified K-means clustering for high dimensional data set, Int. J. Appl. Inf. Syst. (IJAIS) 2 (3) (2012), pp. 34–39 (Foundation of Computer Science FCS, New York, USA).
  • S. Buchala, N. Davey, T.M. Gale, and R.J. Frank, Analysis of linear and nonlinear dimensionality reduction methods for gender classification of face images, Int. J. Syst. Sci. 36 (14) (2005), pp. 931–942. doi:10.1080/00207720500381573.
  • M. Lennon, G. Mercier, M.C. Mouchot, and L. Hubert-Moy, Curvilinear component analysis for nonlinear dimensionality reduction of hyperspectral images, Proc. SPIE Image Signal Process Remote Sens. VII 4541 (2001), pp. 157–168.
  • N.K. Batmanghelich, B. Taskar, and C. Davatzikos, Generative-discriminative basis learning for medical imaging, IEEE Trans. Med. Imaging 31 (2012), pp. 51–69. doi:10.1109/TMI.2011.2162961.
  • L. van der Maaten, E. Postma, and J. van den Herik, Dimensionality reduction: A comparative review, Tilburg Centre for Creative Computing, Tilburg University, Tilburg, The Netherlands, 2009.
  • K. Kuzniar and M. Zajac, Data pre-processing in the neural network identification of the modified walls natural frequencies, Proceedings of the 19th International Conference on Computer Methods in Mechanics CMM-2011, Warszawa, 9–12 May, 2011, pp. 295–296.
  • V.M. Janakiraman, X. Nguyen, and D. Assanis, Nonlinear identification of a gasoline HCCI engine using neural networks coupled with principal component analysis, Appl. Soft Comput. 13 (2013), pp. 2375–2389. doi:10.1016/j.asoc.2013.01.006.
  • K. Seerapu and R. Srinivas, Face recognition using robust PCA and radial basis function network, Int. J. Comput. Sci. Commun. Networks 2 (5) (2012), pp. 584–589.
  • N.M. Peleato, R.L. Legge, and R.C. Andrews, Neural networks for dimensionality reduction of fluorescence spectra and prediction of drinking water disinfection by-products, Water Res. 136 (2018), pp. 84–94. doi:10.1016/j.watres.2018.02.052
  • C. Sucheta and K.V. Prema, Effect of dimensionality reduction on performance in artificial neural network for user authentication, 3rd IEEE International Advance Computing Conference (IACC), Ghaziabad, India, 2013.
  • G.E. Hinton and R.R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science 313 (2006), pp. 504–507. doi:10.1126/science.1127647.
  • Q. Zhu and C. Li, Dimensionality reduction with input training neural network and its application in chemical process modelling, Chinese J. Chern. Eng. 14 (5) (2006), pp. 597–603. doi:10.1016/S1004-9541(06)60121-3.
  • C.-Y. Cheng, C.-C. Hsu, and M.-C. Chen, Adaptive kernel principal component analysis (KPCA) for monitoring small disturbances of nonlinear processes, Ind. Eng. Chem. Res. 49 (2010), pp. 2254–2262. doi:10.1021/ie900521b.
  • C. Chakour, M.F. Harkat, and M. Djeghaba, New adaptive kernel principal component analysis for nonlinear dynamic process monitoring, Appl. Math. Inf. Sci. 9 (4) (2015), pp. 1833–1845.
  • R. Fezai, M. Mansouri, O. Taouali, M.F. Harkat, and N. Bouguila, Online reduced kernel principal component analysis for process monitoring, J. Process Control 61 (2018), pp. 1–11. doi:10.1016/j.jprocont.2017.10.010.
  • Y. Xiao and Y. He, A novel approach for analog fault diagnosis based on neural networks and improved kernel PCA, Neurocomputing 74 (2011), pp. 1102–1115. doi:10.1016/j.neucom.2010.12.003.
  • A. Errachdi and M. Benrejeb, On-line identification using radial basis function neural network coupled with KPCA, Int. J. Gen. Syst. 45 (7) (2016), pp. 1–15.
  • K.N. Reddy and V. Ravi, Differential evolution trained kernel principal component WNN and kernel binary quantile regression: Application to banking, Knowledge-Based Syst. 39 (2013), pp. 45–56. doi:10.1016/j.knosys.2012.10.003.
  • I. Klevecka and J. Lelis, Pre-processing of input data of neural networks: The case of forecasting telecommunication network traffic, Telektronikk 104 (3/4) (2008), pp. 168–178.
  • M. Shirzadeh, A. Amirkhani, A. Jalali, and M.R. Mosavi, An indirect adaptive neural control of a visual-based quadrotor robot for pursuing a moving target, ISA Trans 59 (2015), pp. 290–302. doi:10.1016/j.isatra.2015.10.011.
  • S.J. Yoo, J.B. Park, and Y.H. Choi, Indirect adaptive control of nonlinear dynamic systems using self recurrent wavelet neural networks via adaptive learning rates, Inf. Sci. 177 (2007), pp. 3074–3098. doi:10.1016/j.ins.2007.02.009.
  • B. Scholkopf and A. Smola, Learning with Kernels, MIT Press, Cambridge, 2002.
  • K.S. Narendra and K. Parthasarthy, Identification and control of dynamical systems using neural networks, IEEE Trans. Neural Networks 1 (1) (1990), pp. 4–27. doi:10.1109/72.80202.
