Identification of a nonlinear rational model based on bias compensated multi-innovation stochastic gradient algorithm


Abstract

The nonlinear rational model is a generalized nonlinear model and has been gradually applied in modelling many dynamic processes. The parameter identification of a class of nonlinear rational models is studied in this paper. This identification problem is very challenging because of the complexity of the rational model and the coupling between model inputs and outputs. To identify the nonlinear model, a bias compensated multi-innovation stochastic gradient algorithm is presented. The multi-innovation technique, which replaces the scalar innovation with an innovation vector, is adopted to accelerate the traditional stochastic gradient algorithm. However, the estimate obtained by the accelerated algorithm is biased because of the correlation between the information vector and the noise. To overcome this difficulty, a bias compensation strategy is used: the bias is calculated and compensated to get an unbiased estimate. Theoretical analysis shows that the proposed algorithm can give unbiased estimates with linear complexity. The proposed algorithm is validated by a numerical experiment and the modelling of the propylene catalytic oxidation.


1. Introduction

To describe the dynamic characteristics of nonlinear systems, many nonlinear structures have been developed, such as the NARMAX model, the Volterra series model, the block-oriented nonlinear model and so on [Citation1–4]. In recent years, a model named the nonlinear rational model (NRM) has been gradually applied in the modelling and control of nonlinear systems, particularly in some chemical processes and mechanistic systems [Citation5–8]. The NRM is a kind of generalized nonlinear model. The traditional rational model, NARMAX model, integral model, output affine model and linear difference equation model can all be seen as special cases of it [Citation9].

The NRM is defined as the ratio of two polynomial expansions of past inputs, outputs and prediction errors [Citation10]. The identification of the NRM is quite difficult because the NRM cannot be parameterized into a linear-in-parameter system [Citation9–11] and because the model input and output are coupled. Despite the difficulties, researchers have reported some results [Citation7,Citation9,Citation12]. For example, to identify the parameters of the NRM, a prediction error algorithm and a new rational model estimation (RME) algorithm were proposed [Citation11,Citation13]. To decrease the computational cost of these two algorithms, a recursive RME algorithm, an error back propagation algorithm, an implicit least-squares iterative algorithm, two maximum likelihood algorithms and a globally convergent algorithm were derived [Citation5,Citation7,Citation10,Citation14,Citation15]. To determine the NRM's structure, an orthogonal RME algorithm and a genetic algorithm were investigated [Citation9,Citation11]. Zhu et al. summarized the advances in NRM identification and control [Citation8].

Although these algorithms work well for many NRMs, they have at least O(n²) complexity, which makes them unsuitable for online applications. To decrease the complexity, the stochastic gradient (SG) algorithm is an alternative because it costs only O(n) flops each iteration [Citation16]. There are many gradient-based algorithms, among which, a key term separation gradient iterative algorithm was derived to identify a fractional-order nonlinear system [Citation17], a three-stage forgetting factor SG method was proposed for a Hammerstein system [Citation18] and an auxiliary model stochastic gradient method was studied for a Wiener–Hammerstein system [Citation19].

However, the estimate for the NRM given by the traditional SG algorithm is biased because the information vector is correlated with the noise [Citation10,Citation20,Citation21]. To get unbiased estimates, the bias compensation (BC) technique and the instrumental variable (IV) technique are often used. For example, a BC-based method was proposed to estimate the state of charge of lithium-ion batteries [Citation22], a BC-based sign algorithm was developed to estimate the weight vector of an unknown system [Citation23], an IV method was implemented for detecting and correcting parameter bias within structural equation models [Citation24], and a unified model-implied IV approach was reported for structural equation modelling with mixed variables [Citation25]. For the IV method, although there are some guiding principles, it is often very difficult to select an appropriate instrumental variable. Therefore, the BC method is adopted to obtain an unbiased estimate for the NRM in this paper [Citation10].

This paper considers the parameter identification of the so-called ARX-NRM. This NRM contains a process model in nonlinear rational form, and its noise model has the same denominator as the process model. The identification is investigated in the time domain, without considering modelling error, time-delay estimation, uncertainties in real systems or frequency-domain identification [Citation26–29]. This paper is also restricted to integer-order adaptive methods for nonlinear models. Fractional adaptive methods have recently become an important part of adaptive algorithms; for details, see [Citation30–33].

The main contributions of this paper are as follows:

The parameter identification of an ARX-type nonlinear rational model is considered, which is quite difficult because this model is a nonlinear-in-parameter system and its output is coupled with the input.
To accelerate the stochastic gradient algorithm, the multi-innovation technique is integrated into the algorithm, in which the scalar innovation is replaced by the innovation vector.
The bias of the multi-innovation stochastic gradient algorithm is calculated by the observations and the previous estimates, and then compensated to get an unbiased estimate.
The proposed algorithm is validated by numerical examples and a case study. Results indicate that the proposed algorithm can obtain accurate estimates with a fast convergence speed.

The rest of this work is organized as follows. In Section 2, the ARX-NRM to be estimated is described. An unbiased multi-innovation SG algorithm is then presented in Section 3. In Section 4, the performance of the proposed algorithm is analysed. Numerical examples and a case study are used to validate the proposed algorithm in Section 5. Finally, conclusions are given in Section 6.

2. Problem description

Consider the ARX-NRM depicted in Figure 1, where $u (k)$, $y (k)$ and $v (k)$ are the input, output and noise, respectively. $f (k)$ and $g (k)$ are two nonlinear polynomials in $y (k - i)$ and/or $u (k - j)$, $i, j = 1, 2, \dots, r$.

Figure 1. Block diagram of an ARX-NRM.

From Figure 1, the output $y (k)$ can be expressed as $\begin{aligned} y (k) = \frac{f (k)}{1 + g (k)} + \frac{1}{1 + g (k)} v (k), \end{aligned}$ (1) where $\begin{aligned} f (k) & = f (y (k - 1), \dots, y (k - r), u (k - 1), \dots, u (k - r)), \\ g (k) & = g (y (k - 1), \dots, y (k - r), u (k - 1), \dots, u (k - r)) . \end{aligned}$ (2)

It can be seen from Figure 1 and Equation (1) that the structure of this NRM is similar to that of the ARX model, $y (k) = \frac{B (z^{- 1})}{1 + A (z^{- 1})} u (k) + \frac{1}{1 + A (z^{- 1})} v (k)$ [Citation21]. The difference is that the numerator $f (k)$ and denominator $g (k)$ of the NRM are both nonlinear polynomials, and the input $u (k - i)$ is implicit in the nonlinear transfer function. Thus the NRM in Figure 1 is named ARX-NRM.

Multiplying both sides of Equation (1) by $1 + g (k)$ yields $\begin{aligned} (1 + g (k)) y (k) = f (k) + v (k) \end{aligned}$ (3) with $\begin{aligned} f (k) & = b_{1} ψ_{1} (k) + \dots + b_{n_{b}} ψ_{n_{b}} (k), \\ g (k) & = a_{1} ϕ_{1} (k) + \dots + a_{n_{a}} ϕ_{n_{a}} (k), \end{aligned}$ (4) where $ϕ_{i} (k)$ and $ψ_{j} (k)$ are scalars of the form $u^{j} (k - i)$, $y^{j} (k - i)$, $u^{m} (k - i) y^{l} (k - j)$, etc. The ARX-NRM can then be parameterized as follows: $\begin{aligned} y (k) = - y (k) g (k) + f (k) + v (k) = φ^{T} (k) θ + v (k), \end{aligned}$ (5) where $\begin{aligned} θ & = {[θ_{a}^{T}, θ_{b}^{T}]}^{T} \in R^{n \times 1}, \\ φ (k) & = {[- y (k) φ_{a}^{T} (k), φ_{b}^{T} (k)]}^{T} \in R^{n \times 1}, \\ n & = n_{a} + n_{b}, \\ φ_{a} (k) & = {[ϕ_{1} (k), \dots, ϕ_{n_{a}} (k)]}^{T} \in R^{n_{a} \times 1}, \\ φ_{b} (k) & = {[ψ_{1} (k), \dots, ψ_{n_{b}} (k)]}^{T} \in R^{n_{b} \times 1}, \\ θ_{a} & = {[a_{1}, \dots, a_{n_{a}}]}^{T} \in R^{n_{a} \times 1}, \\ θ_{b} & = {[b_{1}, \dots, b_{n_{b}}]}^{T} \in R^{n_{b} \times 1} . \end{aligned}$ (6)

The identification of the ARX-NRM in Figure 1 is thus transformed into the estimation of the parameter vector $θ$ from the observations $\{u (k), y (k)\}_{k = 1}^{N}$, where $N$ is the data length.
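The parameterization in Equations (5) and (6) can be checked numerically. The following is a minimal sketch, not the authors' code: the basis terms in `basis`, the parameter values and the short data records are all illustrative assumptions. It generates a few outputs from Equation (1) and confirms that the residual $y (k) - φ^{T} (k) θ$ equals the noise $v (k)$.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))

def basis(y, u, k):
    """Assumed basis terms: phi_a(k) for g(k) and phi_b(k) for f(k)."""
    phi_a = [u[k - 1] ** 2, y[k - 1] ** 2]
    phi_b = [y[k - 1] * u[k - 2], u[k - 1]]
    return phi_a, phi_b

def regressor(y, u, k):
    """Information vector phi(k) = [-y(k) * phi_a(k), phi_b(k)], Equation (6)."""
    phi_a, phi_b = basis(y, u, k)
    return [-y[k] * t for t in phi_a] + phi_b

# Hypothetical parameters theta = [a1, a2, b1, b2] and short data records.
theta_a, theta_b = [1.0, 1.0], [0.5, 1.0]
u = [0.3, -0.7, 1.1, 0.4]
v = [0.0, 0.0, 0.02, -0.01]
y = [0.0, 0.0]                          # zero initial conditions (assumed)

for k in range(2, len(u)):
    phi_a, phi_b = basis(y, u, k)       # uses only samples before time k
    f, g = dot(theta_b, phi_b), dot(theta_a, phi_a)
    y.append((f + v[k]) / (1.0 + g))    # Equation (1)

k = 3
residual = y[k] - dot(regressor(y, u, k), theta_a + theta_b)
print(abs(residual - v[k]) < 1e-12)     # Equation (5): the residual is v(k)
```

Note that $φ (k)$ contains the current output $y (k)$, which is why the information vector is correlated with the noise.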

3. Identification algorithm

3.1. Stochastic gradient (SG) algorithm

Consider the parameterized system in Equation (5) and denote the error $e (k)$ as $\begin{aligned} e (k) = y (k) - φ^{T} (k) \hat{θ} (k - 1), \end{aligned}$ (7) where $\hat{θ} (k - 1)$ is the parameter estimate at time $k - 1$. The error $e (k)$ is also called the innovation [Citation16].

Define a cost function as follows: $\begin{aligned} V (k) = \frac{1}{2} e^{2} (k) . \end{aligned}$ (8) The SG algorithm updates the parameter estimate along the negative gradient direction of the cost function $V (k)$ until $V (k)$ reaches its minimum. The stochastic gradient of $V (k)$ with respect to $θ$ at time $k$ is $\begin{aligned} g (k) = - φ (k) e (k) = - φ (k) (y (k) - φ^{T} (k) \hat{θ} (k - 1)) . \end{aligned}$ (9)

The SG algorithm for the identification of $θ$ is as follows [Citation20]: $\begin{aligned} \hat{θ} (k) = \hat{θ} (k - 1) - η (k) g (k), \end{aligned}$ (10) where $η (k)$ is the variable step size that can be calculated by [Citation16] $\begin{aligned} η (k) & = \frac{1}{r (k)}, \quad r (k) = λ r (k - 1) + φ^{T} (k) φ (k), \\ r (0) & = 1, \quad λ \in (0, 1] . \end{aligned}$ (11)
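Equations (7) and (9)–(11) translate into a few lines of code. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the function name `sg_step` and the toy noise-free data (generated by the hypothetical true vector $[1, 2]^{T}$) are not from the paper.

```python
def sg_step(theta, r, phi, y, lam=1.0):
    """One SG update; returns the new estimate, the new r(k) and e(k)."""
    e = y - sum(p * t for p, t in zip(phi, theta))       # innovation, Eq. (7)
    r = lam * r + sum(p * p for p in phi)                # r(k), Eq. (11)
    theta = [t + p * e / r for t, p in zip(theta, phi)]  # Eqs. (9)-(10)
    return theta, r, e

# Toy noise-free record generated by theta0 = [1, 2] (illustrative values).
data = [([1.0, 2.0], 5.0), ([2.0, -1.0], 0.0), ([1.0, 1.0], 3.0)]
theta, r = [0.0, 0.0], 1.0               # r(0) = 1 as in Eq. (11)
for phi, y in data:
    theta, r, e = sg_step(theta, r, phi, y)
    print(theta, e)
```

Each step needs two inner products and one scaled vector addition, so the cost per iteration grows linearly with the number of parameters $n$.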

3.2. Multi-Innovation SG (MI-SG) algorithm

Although the SG algorithm requires less computation than the least-squares algorithm, it converges slowly. To accelerate the algorithm, the multi-innovation technique is introduced [Citation16]. Replacing the single innovation $e (k)$ in Equation (9) with the multi-innovation vector $E_{p} (k)$, and replacing the single information vector $φ (k)$ with the information matrix $Φ_{p} (k)$, gives a stacked gradient $G_{p} (k)$ as follows: $\begin{aligned} G_{p} (k) & = - Φ_{p} (k) E_{p} (k) \in R^{n \times 1}, \\ Φ_{p} (k) & = [φ (k), φ (k - 1), \dots, φ (k - p + 1)] \in R^{n \times p}, \\ E_{p} (k) & = {[e (k), e (k - 1), \dots, e (k - p + 1)]}^{T} \in R^{p \times 1}, \end{aligned}$ (12) where $p$ is the stacking length, and $G_{p} (k)$, $E_{p} (k)$ and $Φ_{p} (k)$ are the stacked gradient, stacked innovation vector and stacked information matrix, respectively. Expanding $G_{p} (k)$ yields $\begin{aligned} G_{p} (k) = - \sum_{i = k - p + 1}^{k} φ (i) e (i) = \sum_{i = k - p + 1}^{k} g (i), \end{aligned}$ (13) where $g (i) = - φ (i) e (i)$ denotes the gradient at time $i$.

It can be seen from Equation (13) that the stacked gradient $G_{p} (k)$ is the sum of the $p$ most recent gradients. It can also be regarded as a weighted sum of information vectors, in which each weighting coefficient is the innovation at the corresponding time. In short, this summation or weighting increases the size of the gradient, modifies its direction and helps to accelerate the gradient algorithm. The gradient algorithm using multi-innovation is called the multi-innovation SG (MI-SG) algorithm.

Using the stacked gradient in Equation (13), the SG estimator in Equation (10) is rewritten as $\begin{aligned} \hat{θ} (k) = \hat{θ} (k - 1) - η (k) G_{p} (k) . \end{aligned}$ (14) Equations (7)–(14), except Equation (10), constitute the MI-SG algorithm.
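The MI-SG recursion can be sketched in the same way. This is an illustrative reading of Equations (7) and (11)–(14), not the authors' code: the name `mi_sg_step`, the use of a fixed-length `deque` to hold the last $p$ pairs $(φ (i), e (i))$, and the toy data are assumptions. Each stored innovation $e (i)$ is the one defined by Equation (7), i.e. computed with $\hat{θ} (i - 1)$ when sample $i$ arrived.

```python
from collections import deque

def mi_sg_step(theta, r, window, phi, y, lam=1.0):
    """One MI-SG update; `window` holds the last p pairs (phi(i), e(i))."""
    e = y - sum(p * t for p, t in zip(phi, theta))   # innovation, Eq. (7)
    r = lam * r + sum(p * p for p in phi)            # r(k), Eq. (11)
    window.append((phi, e))                          # a full deque drops the oldest pair
    # Stacked gradient, Eq. (13): G_p(k) = -sum_i phi(i) e(i).
    G = [-sum(ph[j] * ei for ph, ei in window) for j in range(len(theta))]
    theta = [t - g / r for t, g in zip(theta, G)]    # Eq. (14)
    return theta, r

p = 2                                    # stacking length (illustrative)
window = deque(maxlen=p)
theta, r = [0.0, 0.0], 1.0
for phi, y in [([1.0, 0.0], 2.0), ([0.0, 1.0], 3.0)]:
    theta, r = mi_sg_step(theta, r, window, phi, y)
print(theta)
```

With $p = 1$ the window holds only the current pair and the update reduces to the SG step of Equation (10). Forming the stacked gradient costs about $p n$ multiplications, so the per-iteration cost stays linear in $n$ for a fixed $p$.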

3.3. Bias compensated MI-SG (BC-MI-SG) algorithm

Let us study the properties of the parameter estimate given by the MI-SG algorithm.

Considering Equations (7) and (9), the stacked gradient $G_{p} (k)$ in Equation (12) is rewritten as $\begin{aligned} G_{p} (k) & = - \sum_{i = k - p + 1}^{k} φ (i) e (i) \\ & = - \sum_{i = k - p + 1}^{k} φ (i) (y (i) - φ^{T} (i) \hat{θ} (i - 1)) \\ & = - \sum_{i = k - p + 1}^{k} φ (i) (φ^{T} (i) θ_{0} + v (i) - φ^{T} (i) \hat{θ} (i - 1)) \\ & = - \sum_{i = k - p + 1}^{k} (φ (i) φ^{T} (i) θ_{0} + φ (i) v (i) - φ (i) φ^{T} (i) \hat{θ} (i - 1)) \\ & = - \sum_{i = k - p + 1}^{k} Ω θ_{0} - \sum_{i = k - p + 1}^{k} T (i) + \sum_{i = k - p + 1}^{k} Ω \hat{θ} (i - 1), \end{aligned}$ (15) where $T (i) = φ (i) v (i)$, $Ω = φ (i) φ^{T} (i)$ and $θ_{0}$ denotes the true parameter vector.

Substituting Equation (15) into Equation (14) gives $\begin{aligned} \hat{θ} (k) = \hat{θ} (k - 1) + η (k) \sum_{i = k - p + 1}^{k} Ω θ_{0} + η (k) \sum_{i = k - p + 1}^{k} T (i) - η (k) \sum_{i = k - p + 1}^{k} Ω \hat{θ} (i - 1) . \end{aligned}$ (16) Taking the expectation of both sides of Equation (16) yields $\begin{aligned} E \{\hat{θ} (k)\} & = E \{\hat{θ} (k - 1)\} + η (k) \sum_{i = k - p + 1}^{k} Ω θ_{0} + η (k) E \left\{\sum_{i = k - p + 1}^{k} T (i)\right\} \\ & \quad - η (k) E \left\{\sum_{i = k - p + 1}^{k} Ω \hat{θ} (i - 1)\right\} . \end{aligned}$ (17)

Supposing $Ω$ is not related to $\hat{θ} (i - 1)$, and using the conditional expectation formula [Citation20] $\begin{aligned} E \{f (x) y\} = E \{E \{f (x) | y\} y\}, \end{aligned}$ (18) the fourth term on the right side of Equation (17) can be written as $\begin{aligned} - η (k) E \left\{\sum_{i = k - p + 1}^{k} Ω \hat{θ} (i - 1)\right\} = - η (k) Ω E \left\{\sum_{i = k - p + 1}^{k} \hat{θ} (i - 1)\right\} . \end{aligned}$ (19) Equation (17) becomes $\begin{aligned} E \{\hat{θ} (k)\} & = E \{\hat{θ} (k - 1)\} + η (k) Ω \sum_{i = k - p + 1}^{k} θ_{0} + η (k) \sum_{i = k - p + 1}^{k} E \{T (i)\} \\ & \quad - η (k) Ω \sum_{i = k - p + 1}^{k} E \{\hat{θ} (i - 1)\} . \end{aligned}$ (20)

When $k \to \infty$, $E \{\hat{θ} (k)\} = E \{\hat{θ} (k - 1)\}$, so we have $\begin{aligned} \lim_{k \to \infty} \sum_{i = k - p + 1}^{k} E \{\hat{θ} (i - 1) - θ_{0}\} = Ω^{- 1} \sum_{i = k - p + 1}^{k} E \{T (i)\} \neq 0 . \end{aligned}$ (21) Equation (21) shows that the parameter estimate obtained by the MI-SG algorithm is biased. The bias $Ω^{- 1} \sum_{i = k - p + 1}^{k} E \{T (i)\}$ is caused by the term $η (k) \sum_{i = k - p + 1}^{k} T (i)$ on the right of Equation (16). To obtain an unbiased estimate, this term must be subtracted from the right of Equation (14), i.e. $\begin{aligned} \hat{θ} (k) = \hat{θ} (k - 1) - η (k) G_{p} (k) - η (k) \sum_{i = k - p + 1}^{k} T (i) . \end{aligned}$ (22)

However, $T (i)$ cannot be calculated from $φ (i) v (i)$ because $v (i)$ is unknown. For the ARX-NRM in Figure 1, considering Equation (6) and replacing the unknown $v (i)$ by $e (i)$, $T (i)$ is calculated by $\begin{aligned} T (i) = φ (i) v (i) & = \begin{bmatrix} - y (i) v (i) φ_{a} (i) \\ v (i) φ_{b} (i) \end{bmatrix} = \begin{bmatrix} - y (i) v (i) φ_{a} (i) \\ 0_{n_{b} \times 1} \end{bmatrix} \\ & \approx \begin{bmatrix} - y (i) e (i) φ_{a} (i) \\ 0_{n_{b} \times 1} \end{bmatrix} . \end{aligned}$ (23) Substituting Equation (23) into Equation (22) yields $\begin{aligned} \hat{θ} (k) = \hat{θ} (k - 1) - η (k) G_{p} (k) + η (k) \sum_{i = k - p + 1}^{k} y (i) e (i) \begin{bmatrix} φ_{a} (i) \\ 0_{n_{b} \times 1} \end{bmatrix} . \end{aligned}$ (24) Equations (7)–(13) and Equation (24) (except Equation (10)) constitute the bias compensated MI-SG (BC-MI-SG) algorithm.

4. Performance analysis

4.1. Computational analysis

The computational costs per iteration of the SG, MI-SG, RLS and BC-MI-SG algorithms are shown in Table 1. It is seen that:

The computational complexity of each of the three SG-type algorithms is $O (n)$.
The MI-SG and the BC-MI-SG algorithm cost more computations than the SG algorithm because of the multi-innovation.
The proposed algorithm costs less computation than the recursive least squares (RLS) algorithm, whose complexity is $O (n^{2})$ [Citation9,Citation16].

Table 1. Computational costs of the SG, MI-SG, BC-MI-SG, and RLS algorithms.


4.2. Convergence analysis

According to Section 3.3, for the proposed BC-MI-SG algorithm, Equation (17) becomes $\begin{aligned} E \{\hat{θ} (k)\} = E \{\hat{θ} (k - 1)\} + η (k) \sum_{i = k - p + 1}^{k} Ω θ_{0} - η (k) E \left\{\sum_{i = k - p + 1}^{k} Ω \hat{θ} (i - 1)\right\} . \end{aligned}$ (25) When $k \to \infty$, $E \{\hat{θ} (k)\} = E \{\hat{θ} (k - 1)\}$, and Equation (25) can be rewritten as $\begin{aligned} \lim_{k \to \infty} \sum_{i = k - p + 1}^{k} E \{\hat{θ} (i - 1) - θ_{0}\} = 0 . \end{aligned}$ (26) Considering $\lim_{k \to \infty} E \{\hat{θ} (k - j)\} = E \{\hat{θ} (k)\}, j = 0, 1, \dots, p - 1$, Equation (26) is rewritten as $\begin{aligned} p \lim_{k \to \infty} E \{\hat{θ} (k) - θ_{0}\} = 0, \end{aligned}$ (27) which means $\lim_{k \to \infty} E \{\hat{θ} (k)\} = θ_{0}$.

5. Experiment results

5.1. Numerical example

Consider an ARX-NRM in Equation (1) with $\begin{aligned} f (k) & = 0.5 y (k - 1) u (k - 2) + u (k - 1), \\ g (k) & = 1 + u^{2} (k - 1) + y^{2} (k - 1), \end{aligned}$ (28) where the input $u (k)$ is a Gaussian signal with mean zero and variance ${1.0}^{2}$. A noise $v (k)$ with mean zero is added to the model. 600 observations are collected and depicted in Figure 2. The initial value of each entry of the parameter vector is set to $1 \times 10^{- 6}$ and the estimation error is defined as $δ (\%) = \frac{‖ \hat{θ} (N) - θ_{0} ‖}{‖ θ_{0} ‖} \times 100$.
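The data generation and the error measure described above can be sketched as follows. This is a hedged illustration, not the authors' experiment: it reads Equations (1) and (28) literally, and the Gaussian white noise with standard deviation 0.01, the zero initial conditions, the seed and the function names `simulate` and `delta` are all assumptions.

```python
import math
import random

def simulate(N=600, sigma=0.01, seed=0):
    """Generate u(k), y(k) from Equations (1) and (28), read literally."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(N)]     # zero mean, unit variance
    y = [0.0, 0.0]                                  # zero initial conditions (assumed)
    for k in range(2, N):
        f = 0.5 * y[k - 1] * u[k - 2] + u[k - 1]
        g = 1.0 + u[k - 1] ** 2 + y[k - 1] ** 2
        v = rng.gauss(0.0, sigma)                   # Gaussian white noise (assumed)
        y.append((f + v) / (1.0 + g))               # Equation (1)
    return u, y

def delta(theta_hat, theta0):
    """Relative estimation error in percent, as defined in the text."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(theta_hat, theta0)))
    den = math.sqrt(sum(b ** 2 for b in theta0))
    return 100.0 * num / den

u, y = simulate()
print(len(u), len(y))
print(delta([0.0, 0.0], [3.0, 4.0]))
```

An all-zero estimate gives an error of 100%, which is close to the starting point of an error curve when every entry is initialized to $1 \times 10^{- 6}$.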

Results using BC-MI-SG, MI-SG, SG and RLS algorithms

Figure 2. Curves of the observed data.

The parameter estimates obtained with the proposed BC-MI-SG algorithm are shown in Table 2, and the estimation errors are depicted in Figure 3, where $σ^{2} = {0.01}^{2}, λ = 0.4, p = 3$. For comparison, the parameter estimates given by the SG, RLS and MI-SG algorithms are also listed in Table 2, and their estimation errors are also depicted in Figure 3.

Figure 3. Estimation errors using the SG, MI-SG and BC-MI-SG algorithms.

Table 2. Estimates using the SG, MI-SG, RLS and BC-MI-SG algorithms.


It can be seen that:

In Figure 3, all curves decrease as $k$ increases, which means that the estimation errors of the three algorithms become smaller as new data are used.
The error curves of the two algorithms with MI are almost the same and are far lower than that of the SG algorithm. In other words, the parameter estimates given by the two MI-SG algorithms are more accurate than those given by the SG algorithm.
Of the two curves with MI, the curve of the proposed BC-MI-SG algorithm is the lower one, which shows that the estimation error given by the proposed algorithm is smaller. That is to say, the bias compensation can improve the estimation accuracy of the MI-SG algorithm.
In the second half of the identification, the curve of the RLS algorithm differs little from that of the proposed algorithm, which implies that the RLS algorithm can also give an accurate estimate for the ARX-NRM. However, the RLS algorithm requires so much computation that it is unsuitable for some situations that need fast identification (see Table 1 for more details).

Results using the BC-MI-SG algorithm with different noise variances

To show the performance of the proposed algorithm under different noise levels, we estimate the ARX-NRM in Equation (28) using the BC-MI-SG algorithm with $σ^{2} = {0.01}^{2}, {0.02}^{2}, {0.04}^{2}$. The estimation errors for the different $σ^{2}$ are depicted in Figure 4.

Figure 4. Estimation errors using BC-MI-SG with different noise variances.

It can be seen that:

For a given noise variance $σ^{2}$ , the overall trend of the estimation error decreases with the increase of $k$ .
When the variance is small, the curve of the estimation error is relatively smooth. With the increase of variance, the fluctuation of the error curve increases.
The estimation error for $σ^{2} = {0.01}^{2}$ is smaller than those for $σ^{2} = {0.02}^{2}$ and $σ^{2} = {0.04}^{2}$. That is, a larger $σ^{2}$ reduces the accuracy of the estimate.

5.2. Case study

A chemical model describing propylene catalytic oxidation with the following structure [Citation5,Citation15,Citation34] is used to validate the proposed algorithm: $\begin{aligned} y (k) = \frac{b C_{p} (k)}{1 + a \frac{C_{p} (k)}{C_{o}^{0.5} (k)}} + \frac{v (k)}{1 + a \frac{C_{p} (k)}{C_{o}^{0.5} (k)}}, \end{aligned}$ where the two inputs $C_{o} (k)$ and $C_{p} (k)$ are the oxygen and propylene concentrations at time $k$, respectively. The rate of disappearance of propylene $y (k)$ is taken as the output variable. The true values are $a_{0} = 0.231, b_{0} = 7.28 \times 10^{- 4}$. The inputs $C_{o} (k)$ and $C_{p} (k)$ are taken as random integers between 1 and 100 and between 1 and 10, respectively, and $\{v (k)\}$ is taken as a white noise sequence with mean zero and variance $σ^{2} = {0.1}^{2}$. The curve of the 600 observed data is shown in Figure 5.

Figure 5. Curve of the propylene catalytic oxidation data.

The following ARX-NRM structure is used: $\begin{aligned} y (k) & = \frac{f (k)}{g (k)} + \frac{1}{g (k)} v (k) \quad \text{with} \\ f (k) & = b u_{p} (k), \quad g (k) = 1 + a \frac{u_{p} (k)}{u_{o} (k)} . \end{aligned}$ The estimate obtained using the proposed BC-MI-SG algorithm with $λ = 0.4, p = 3$ is listed in Table 3, where the estimation error is calculated by the following formula: $δ (N) = \left(‖ \frac{a_{0} - \hat{a} (N)}{a_{0}} ‖ + ‖ \frac{b_{0} - \hat{b} (N)}{b_{0}} ‖\right) \times 100$.
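For this case study the linear-in-form regression of Equation (5) has only two parameters. The sketch below is an illustration under stated assumptions, not the authors' code: it takes $u_{p} (k) = C_{p} (k)$ and $u_{o} (k) = C_{o}^{0.5} (k)$, draws the integer inputs with inclusive bounds, uses Gaussian noise and an arbitrary seed, and introduces the hypothetical helper names `regressor` and `delta`. It checks that, with the true parameters, the residual $y (k) - φ^{T} (k) θ_{0}$ equals $v (k)$, and implements the error measure $δ (N)$.

```python
import math
import random

def regressor(y, c_p, c_o):
    """phi(k) = [-y(k) C_p(k) / sqrt(C_o(k)), C_p(k)] for theta = [a, b]."""
    return [-y * c_p / math.sqrt(c_o), c_p]

def delta(a_hat, b_hat, a0, b0):
    """Sum of the relative errors of a and b, in percent."""
    return (abs((a0 - a_hat) / a0) + abs((b0 - b_hat) / b0)) * 100.0

a0, b0 = 0.231, 7.28e-4                  # true values quoted in the text
rng = random.Random(0)
worst = 0.0
for _ in range(600):
    c_o = rng.randint(1, 100)            # inclusive bounds are an assumption
    c_p = rng.randint(1, 10)
    v = rng.gauss(0.0, 0.1)              # Gaussian white noise (assumed)
    denom = 1.0 + a0 * c_p / math.sqrt(c_o)
    y = (b0 * c_p + v) / denom
    phi = regressor(y, c_p, c_o)
    # With the true parameters, y - phi^T theta0 equals the noise v.
    worst = max(worst, abs(y - (phi[0] * a0 + phi[1] * b0) - v))
print(worst < 1e-9)
```

Because the first entry of $φ (k)$ contains $y (k)$, it is correlated with $v (k)$; this is the correlation that the bias compensation of Section 3.3 targets.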

Table 3. Results using the SG, MI-SG and BC-MI-SG algorithms for the propylene catalytic oxidation data.


For comparison, the estimates of the SG and MI-SG algorithms are also shown in Table 3. The estimation error of the proposed algorithm is the smallest among the three algorithms, which means that the model obtained by the BC-MI-SG algorithm is the most accurate of the three.

6. Conclusion

To identify the parameters of an ARX-NRM, a bias compensated multi-innovation stochastic gradient algorithm is presented. To accelerate the traditional stochastic gradient algorithm, the multi-innovation technique is integrated into the algorithm; it replaces the scalar innovation in the SG algorithm with an innovation vector. Theoretical analysis shows that the MI-SG algorithm gives a biased estimate because the output contained in the information vector is correlated with the noise. To get an unbiased estimate, the bias is first calculated and then compensated in the MI-SG algorithm. The proposed algorithm is validated by numerical experiments and the modelling of the propylene catalytic oxidation. Results indicate that the proposed algorithm can give accurate estimates with less computation than the RLS algorithm.

Data availability statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the National Natural Science Foundation of China under Grant 62003150, the Natural Science Research Project of Colleges and Universities in Jiangsu under Grant 21KJB470017 and the National Scholarship Foundation of China under Grant 201908320048.

References

[1] Becerra JA, Ayora MM, Reina-Tosina J, et al. Sparse identification of Volterra models for power amplifiers without pseudoinverse computation. IEEE Trans Microwave Theory Tech. 2020;68(11):4570–4578.
[2] Jing S. Identification of a deterministic Wiener system based on input least squares algorithm and direct residual method. Int J Model Ident Control. 2020;34(3):208–216.
[3] Obeid S, Ahmadi G, Jha R. NARMAX identification based closed-loop control of flow separation over NACA 0015 airfoil. Fluids. 2020;5(3):100.
[4] Wu H, Chen SX, Zhang YH, et al. Robust bearings-only tracking algorithm using structured total least squares-based Kalman filter. Automatika. 2015;563:275–280.
[5] Chen J, Ding F, Zhu Q, et al. Maximum likelihood based identification methods for rational models. Int J Syst Sci. 2019;50(14):2579–2591.
[6] Kambhampati C, Mason JD, Warwick K. A stable one-step-ahead predictive control of non-linear systems. Automatica (Oxf). 2000;36(4):485–495.
[7] Mu B, Bai E, Zheng W, et al. A globally consistent nonlinear least squares estimator for identification of nonlinear rational systems. Automatica (Oxf). 2017;77:322–335.
[8] Zhu Q, Wang Y, Zhao D, et al. Review of rational (total) nonlinear dynamic system modelling, identification, and control. Int J Syst Sci. 2015;46(12):2122–2133.
[9] Zhu Q, Billings SA. Parameter estimation for stochastic nonlinear rational models. Int J Control. 1993;57(2):309–333.
[10] Zhu Q, Billings SA. Recursive parameter estimation for nonlinear rational models. Dept of Automatic Control and System Engineering. Sheffield: University of Sheffield; 1991.
[11] Billings SA, Chen S. Identification of non-linear rational systems using a prediction-error estimation algorithm. Int J Syst Sci. 1989;20(3):467–494.
[12] Mao K, Billings SA, Zhu Q. A regularised least squares algorithm for nonlinear rational model identification. Department of Automatic Control and Systems Engineering. Sheffield: University of Sheffield; 1996.
[13] Billings SA, Zhu Q. Rational model identification using an extended least-squares algorithm. Int J Control. 1991;54(3):529–546.
[14] Zhu Q. A back propagation algorithm to estimate the parameters of non-linear dynamic rational models. Appl Math Model. 2003;27(3):169–187.
[15] Zhu Q. An implicit least squares algorithm for nonlinear rational model parameter estimation. Appl Math Model. 2005;29(7):673–689.
[16] Ding F. System identification New theory and methods. Beijing: Science Press; 2013.
[17] Wang J, Ji Y, Zhang C. Iterative parameter and order identification for fractional-order nonlinear finite impulse response systems using the key term separation. Int J Adapt Control Signal Process. 2021;35(8):1562–1577.
[18] Ji Y, Kang Z. Three-stage forgetting factor stochastic gradient parameter estimation methods for a class of nonlinear systems. Int J Robust Nonlinear Control. 2021;31(3):971–987.
[19] Xu L, Ding F, Yang E. Auxiliary model multiinnovation stochastic gradient parameter estimation methods for nonlinear sandwich systems. Int J Robust Nonlinear Control. 2021a;31(1):148–165.
[20] Fang C, Xiao D. Process identification. Beijing: Tsinghua University Press; 1988.
[21] Ljung L. System identification (2nd Ed.): Theory for the user. USA: Prentice Hall PTR; 1999.
[22] Ouyang T, Xu P, Chen J, et al. A novel state of charge estimation method for lithium-ion batteries based on bias compensation. Energy. 2021;226:120348.
[23] Ni J, Gao Y, Chen X, et al. Bias-compensated sign algorithm for noisy inputs and its step-size optimization. IEEE Trans Signal Process. 2021;69:2330–2342.
[24] Grace JB. Instrumental variable methods in structural equation models. Methods Ecol Evol. 2021;12(7):1148–1157.
[25] Jin S, Yang-Wallentin F, Bollen KA. A unified model-implied instrumental variable approach for structural equation modeling with mixed variables. Psychometrika. 2021;86(2):564–594.
[26] Filipovic V, Nedic N, Stojanovic V. Robust identification of pneumatic servo actuators in the real situations. Forsch Ingenieurwes. 2011;75(4):183–196.
[27] Tao H, Li X, Paszke W, et al. Robust PD-type iterative learning control for discrete systems with multiple time-delays subjected to polytopic uncertainty and restricted frequency-domain. Multidimension Syst Signal Process. 2021;32(2):671–692.
[28] Xu Z, Li X, Stojanovic V. Exponential stability of nonlinear state-dependent delayed impulsive systems with applications. Nonlinear Analysis: Hybrid Systems. 2021b;42:101088.
[29] Zhang X, Wang H, Stojanovic V, et al. Asynchronous fault detection for interval type-2 fuzzy nonhomogeneous higher-level Markov jump systems with uncertain transition probabilities. IEEE Trans Fuzzy Syst. 2021. doi:https://doi.org/10.1109/TFUZZ.2021.3086224
[30] Chaudhary NI, Raja MAZ, Khan AUR. Design of modified fractional adaptive strategies for Hammerstein nonlinear control autoregressive systems. Nonlinear Dyn. 2015;82(4):1811–1830.
[31] Chaudhary NI, Raja MAZ, He Y, et al. Design of multi innovation fractional LMS algorithm for parameter estimation of input nonlinear control autoregressive systems. Appl Math Model. 2021;93:412–425.
[32] Khan AA, Shah SM, Raja MAZ, et al. Fractional LMS and NLMS algorithms for line echo cancellation. Arab J Sci Eng. 2021;46:9385–9398.
[33] Raja MAZ, Chaudhary NI. Two-stage fractional least mean square identification algorithm for parameter estimation of CARMA systems. Signal Process. 2015;107:327–339.
[34] Dimitrov SD, Kamenski DI. A parameter estimation method for rational functions. Comput Chem Eng. 1991;15(9):657–662.
