Research Article

Disturbance observer-based fuzzy adaptive optimal finite-time control for nonlinear systems

Article: 2199211 | Received 29 Dec 2022, Accepted 30 Mar 2023, Published online: 14 Apr 2023

Abstract

This paper investigates disturbance observer-based fuzzy adaptive optimal finite-time control, built on the backstepping approach, for strict-feedback nonlinear systems with a bias fault term and full state constraints. Because external disturbances and bias fault signals can degrade both stability and control quality, a disturbance observer is constructed to track the external disturbances and the bias fault online. A disturbance observer-based finite-time control strategy is proposed to achieve optimized control by utilizing the fuzzy logic system approximation-based adaptive dynamic programming method under the critic-actor framework. The critic evaluates control performance and the actor executes the control behaviour. In addition, it is proved that the proposed fuzzy adaptive optimal finite-time scheme not only guarantees that all signals in the closed-loop system are bounded, but also ensures that the system states are restricted within specific sets. Finally, simulation results are shown to demonstrate the effectiveness of the proposed control strategy.


1. Introduction

In the past few decades, research on controller design for nonlinear systems has developed rapidly, aided by the approximation characteristics of neural networks and fuzzy logic systems [1–4]. Early control strategies realize the control objectives for nonlinear systems only over an infinite time horizon, which means that the stability of the systems cannot be guaranteed in finite time. In practical problems, however, it is imperative to achieve superior control performance in finite time.

Unlike asymptotic stability, finite-time stability converges fast without requiring a long transient response, so it has attracted wide attention and has been applied to practical systems such as aircraft flight systems [5], cart-pole systems [6] and teleoperation systems [7]. Bhat and Bernstein [8] explained finite-time stability based on Lyapunov theory and constructed a feedback controller for nonlinear systems to achieve finite-time stability. Qiu et al. [9] established a fuzzy finite-time controller for nonlinear systems that confines the tracking error to a bounded set. Li et al. [10] studied the problem of finite-time containment control involving unmeasurable states for nonlinear multiagent systems with input delay. Zhang et al. [11] designed an adaptive fuzzy control strategy based on a finite-time stability criterion for nonlinear systems with unavailable states, in which all states are confined to restricted sets. Meng et al. [12] developed a finite-time quantized controller for nonlinear systems with unknown control directions, constructing state and disturbance observers simultaneously. Sui et al. [13] constructed an event-triggered adaptive finite-time control scheme by combining backstepping and a varying threshold condition for stochastic nonlinear systems with unmodelled dynamics. Saravanakumar et al. [14] first investigated the finite-time stability problem. The authors in [15–17] studied finite-time control for nonlinear systems with full-state constraints. Nguyen et al. [18] introduced a fuzzy control strategy for parallel manipulators to guarantee that the tracking error converges fast in finite time. Wang et al. [19] constructed a finite-time control algorithm for stochastic nonlinear systems with actuator faults. The above literature takes finite time into account, and the designed control strategies ensure that the control objective is achieved within finite time. However, these works do not involve optimal control to deal with the problem of resource consumption.

In recent years, optimal control has been an active topic. It is concerned with establishing control strategies that achieve the control objectives with the least resources under an optimal policy. For nonlinear systems, solving the Hamilton–Jacobi–Bellman (HJB) equation is essential in the control process, but analytic solutions are hard to obtain because of the dynamic uncertainty and strong nonlinearity in actual control. To overcome this problem, dynamic programming (DP) was developed by Bellman [20]. Note that although DP is an effective tool for obtaining optimal solutions, it is prone to the curse of dimensionality, that is, the computational burden grows rapidly with the dimension. Adaptive dynamic programming (ADP)-based algorithms were demonstrated to overcome this problem efficiently. Werbos [21] developed ADP for discrete systems, and Abu-Khalaf and Lewis [22] proposed an ADP control scheme for continuous-time nonlinear systems in which a neural network approximation structure was established to estimate the value function of the HJB equation. Vamvoudakis et al. [23] developed an online actor-critic scheme combined with neural networks for continuous-time systems. Wen et al. [24] designed an optimized formation control strategy based on an identifier-actor-critic structure and the approximation characteristic of fuzzy logic systems. Li and Li [25] introduced a fuzzy adaptive optimal fault-tolerant control scheme for stochastic multiagent systems that incorporates a Butterworth low-pass filter to compensate for the negative influence of nonlinear faults on the system. Wen et al. [26] studied the fuzzy adaptive optimized control issue for nonlinear systems with unmeasured states, in which the state observer satisfies a Hurwitz condition that sidesteps the design of constants. Lan et al. [27] introduced an adaptive optimal formation control technique for multiagent systems with unmeasurable states. Li et al. [28] designed a neural network adaptive optimized control algorithm for nonlinear systems with immeasurable and constrained states. The authors in [29–31] discussed the problem of adaptive optimal control for nonlinear systems. Wen et al. [32] built an adaptive optimal control scheme by virtue of an identifier–critic–actor architecture. To meet high-precision control requirements, the influence of external disturbances on the system must be considered, which motivates the design of an optimal finite-time control strategy for controlled systems with external disturbances. However, the approaches proposed in [20–32] ignored the external disturbance in the controlled system.

Eliminating the influence of external disturbances on systems is an important research direction [33–38]. Ji et al. [39] designed an adaptive fault-tolerant optimized formation control for multiagent systems with disturbances, in which a disturbance observer was constructed to alleviate the negative effects of external disturbances. Liu et al. [40] developed an adaptive control scheme for Markovian jump systems that suffer from external disturbances and actuator faults, in which the external disturbances contain matched and mismatched parts. Xu et al. [41] designed a neural network disturbance observer for strict-feedback systems to achieve good control performance in spite of unknown dynamics and time-varying disturbances. Song and Lewis [42] introduced a robust optimal control scheme with a disturbance observer to estimate disturbances for nonlinear systems. Zerari and Chemachema [43] studied continuously stirred tank reactor systems with external disturbances and introduced a compensator into the designed control strategy to counter the influence of external interference. Ran et al. [44] proposed a disturbance rejection optimal control for nonlinear systems, in which the perturbations and other uncertainties are estimated by the designed disturbance observer. Chen and Ge [45] presented an adaptive neural control strategy for nonlinear systems with unmeasured states, hysteresis and disturbances. However, the foregoing works rarely studied the optimized finite-time control problem based on fuzzy systems for nonlinear systems with full-state constraints and external disturbances.

Motivated by the aforementioned research, this paper considers the fuzzy adaptive optimal finite-time control issue for nonlinear systems with full-state constraints, external disturbances and a bias fault. Combining backstepping and ADP techniques, a fuzzy adaptive optimal finite-time control strategy is designed. The unknown system functions are approximated by fuzzy logic systems, and disturbance observers are designed to handle the influence of the external disturbances and the bias fault signal. Virtual and actual controllers are constructed within the actor-critic architecture and the backstepping framework. The main contributions are summarized as follows.

  1. In comparison with [25,27,28], the designed scheme, built on the actor-critic framework and finite-time stability theory, ensures that the controlled system not only has a faster convergence rate but also drives the tracking error into a small neighbourhood of the origin in finite time.

  2. A disturbance observer-based fuzzy finite-time strategy based on the actor-critic structure is proposed. Unlike the control schemes in [26,32,46], it not only considers the full-state constraints for strict-feedback nonlinear systems but also addresses the effects of the bias fault and external disturbances on the system by designing the disturbance observer.

The remainder of this article is organized as follows. In Section 2, the system description and preliminary knowledge are presented. Section 3 proposes an observer-based fuzzy optimal control scheme. Subsequently, stability analysis is given in Section 4. A simulation example is shown in Section 5. Finally, the conclusion is presented in Section 6.

2. Preliminaries and problem statement

2.1. System description

The strict-feedback nonlinear system with unknown dynamics is formulated as
$$\begin{cases}\dot{x}_i = x_{i+1} + \phi_i(\bar{x}_i) + D_i(t), & 1 \le i \le n-1,\\ \dot{x}_n = u_A + \phi_n(\bar{x}_n) + D_n(t),\\ y(t) = x_1,\end{cases} \tag{1}$$
where $\bar{x}_i = [x_1, x_2, \ldots, x_i]^T \in \mathbb{R}^i$ ($i = 1, \ldots, n$) is the state vector of the system and $y \in \mathbb{R}$ is the control output. $\phi_i(\bar{x}_i)$, defined on $\mathbb{R}^i$, are the unknown functions and $D_i(t)$ denote the external disturbances, $i = 1, \ldots, n$. All system states are restricted to a compact set, that is, $|x_i| < k_{ic}$, where $k_{ic}$ are positive constants, $i = 1, 2, \ldots, n$.

In actual engineering applications, actuator bias faults often occur during the operation of the actuator. Thus we consider the actuator bias fault signal as
$$u_A = u + u_f, \tag{2}$$
where $u$ denotes the control input and $u_f$ represents the actuator bias fault signal. Suppose $u_f$ is bounded, that is, there is a constant $F$ such that $|u_f| \le F$.

Assumption 2.1

[Citation39]

The external disturbances $D_i(t)$ are unknown and bounded, i.e. there exist real numbers $\underline{D}_i$ and $\bar{D}_i$ satisfying $|\dot{D}_i(t)| \le \underline{D}_i$ and $|D_i(t)| \le \bar{D}_i$, where $i = 1, 2, \ldots, n$.

Assumption 2.2

[Citation25]

The reference trajectory $y_r$ is known and bounded, and its derivatives $\dot{y}_r, \ldots, y_r^{(n)}$ are bounded.

2.2. Finite-time theory

The following lemmas are beneficial to facilitate the design of the controller.

Lemma 2.1

[Citation15]

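For constants $C_1 > 0$, $C_2 > 0$, $0 < \delta < 1$ and $0 < l < 1$, if there exists a $C^2$ function $V(x)$ satisfying
$$\dot{V}(x) \le -C_1 V^{\delta}(x) - C_2 V(x) + M, \tag{3}$$
then the system is finite-time stable, where the settling time satisfies
$$T_0 \le \frac{1}{C_2(1-\delta)} \ln \frac{C_2 V^{1-\delta}(x_0) + l C_1}{C_2 \left(\frac{M}{(1-l)C_1}\right)^{\frac{1-\delta}{\delta}} + l C_1}.$$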
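The settling-time bound in Lemma 2.1 is easy to evaluate numerically. The following is a minimal sketch, assuming the bound has the form $T_0 \le \frac{1}{C_2(1-\delta)}\ln\frac{C_2V^{1-\delta}(x_0)+lC_1}{C_2(M/((1-l)C_1))^{(1-\delta)/\delta}+lC_1}$. The function name `settling_time_bound` and the sample values are illustrative and not taken from the paper.

```python
import math


def settling_time_bound(v0, c1, c2, m, delta, l):
    """Upper bound on the settling time from Lemma 2.1 (illustrative helper).

    v0    : initial Lyapunov value V(x0) > 0
    c1,c2 : positive constants of the differential inequality
    m     : positive offset M
    delta : exponent with 0 < delta < 1
    l     : design constant with 0 < l < 1
    """
    if not (0.0 < delta < 1.0 and 0.0 < l < 1.0):
        raise ValueError("delta and l must lie strictly between 0 and 1")
    residual = (m / ((1.0 - l) * c1)) ** ((1.0 - delta) / delta)
    numerator = c2 * v0 ** (1.0 - delta) + l * c1
    denominator = c2 * residual + l * c1
    return math.log(numerator / denominator) / (c2 * (1.0 - delta))


if __name__ == "__main__":
    print(settling_time_bound(v0=4.0, c1=2.0, c2=1.0, m=0.1, delta=0.5, l=0.5))
```

A larger initial value $V(x_0)$ increases the bound only logarithmically, which is the practical appeal of the combined $V^{\delta}$ and $V$ decay terms.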

Lemma 2.2

[Citation29]

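For any $\tau_i \in \mathbb{R}$, $i = 1, \ldots, n$, and a constant $0 < q \le 1$, one has
$$\left(\sum_{i=1}^{n} |\tau_i|\right)^{q} \le \sum_{i=1}^{n} |\tau_i|^{q} \le n^{1-q}\left(\sum_{i=1}^{n} |\tau_i|\right)^{q}. \tag{4}$$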

Lemma 2.3

[Citation29]

For real variables $p_1$ and $p_2$, the following inequality holds:
$$|p_1|^{r_1} |p_2|^{r_2} \le \frac{r_1}{r_1+r_2}\, r_3\, |p_1|^{r_1+r_2} + \frac{r_2}{r_1+r_2}\, r_3^{-\frac{r_1}{r_2}}\, |p_2|^{r_1+r_2}, \tag{5}$$
where $r_1$, $r_2$ and $r_3$ are positive constants.
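Both inequalities can be sanity-checked numerically before they are used in the stability proof. The sketch below assumes the standard forms of Lemma 2.2 (power-sum inequality with $0<q\le 1$) and Lemma 2.3 (weighted Young inequality). The helper names are hypothetical.

```python
def power_sum_bounds(taus, q):
    """Return the three quantities compared in Lemma 2.2 (left, middle, right)."""
    n = len(taus)
    total = sum(abs(t) for t in taus)
    left = total ** q
    middle = sum(abs(t) ** q for t in taus)
    right = n ** (1.0 - q) * total ** q
    return left, middle, right


def young_gap(p1, p2, r1, r2, r3):
    """Right-hand side minus left-hand side of Lemma 2.3; non-negative if it holds."""
    lhs = abs(p1) ** r1 * abs(p2) ** r2
    rhs = (r1 / (r1 + r2)) * r3 * abs(p1) ** (r1 + r2) \
        + (r2 / (r1 + r2)) * r3 ** (-r1 / r2) * abs(p2) ** (r1 + r2)
    return rhs - lhs


if __name__ == "__main__":
    print(power_sum_bounds([1.0, 4.0, 9.0], q=0.5))
    # Choice used later in the proof: p1 = 1, r1 = 1 - delta, r2 = delta, r3 = 1
    print(young_gap(1.0, 0.3, r1=0.4, r2=0.6, r3=1.0))
```

With $p_1 = 1$, $r_1 = 1-\delta$, $r_2 = \delta$ and $r_3 = 1$, Lemma 2.3 reduces to $p_2^{\delta} \le 1 - \delta + \delta p_2$, which is the form applied to the weight-error sums in the proof of Theorem 4.1.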

2.3. Fuzzy logic system

Many real-world systems exhibit unknown dynamics that degrade control performance, and the fuzzy logic system approximation approach helps to counteract this negative effect. The knowledge base consists of IF–THEN rules stated in the following form:

$R^k$: If $x_1$ is $G_1^k$, $x_2$ is $G_2^k$, $\ldots$, $x_l$ is $G_l^k$, then $y$ is $Q^k$, $k = 1, 2, \ldots, N$,

where $x = (x_1, x_2, \ldots, x_l)^T \in \mathbb{R}^l$ and $y$ are the input and output of the fuzzy logic system, $G_i^k$ and $Q^k$ represent fuzzy sets, and $N$ denotes the number of rules. Then the fuzzy logic system can be described by
$$y(x) = \frac{\sum_{k=1}^{N} \bar{y}_k \prod_{i=1}^{l} \mu_{G_i^k}(x_i)}{\sum_{k=1}^{N} \prod_{i=1}^{l} \mu_{G_i^k}(x_i)}, \tag{6}$$
where $\mu_{G_i^k}$ and $\mu_{Q^k}$ are fuzzy membership functions, and $\bar{y}_k = \max_{y \in \mathbb{R}} \mu_{Q^k}(y)$.

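The basis functions are defined as
$$S_k(x) = \frac{\prod_{i=1}^{l} \mu_{G_i^k}(x_i)}{\sum_{k=1}^{N} \prod_{i=1}^{l} \mu_{G_i^k}(x_i)}. \tag{7}$$
Denote $\vartheta = (\vartheta_1, \vartheta_2, \ldots, \vartheta_N)^T = (\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_N)^T$ and $S(x) = (S_1(x), S_2(x), \ldots, S_N(x))^T$; then (6) can be stated as follows:
$$y(x) = \vartheta^T S(x). \tag{8}$$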
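The normalized basis functions and the resulting linear-in-weights output can be sketched in a few lines. This is a minimal single-input example ($l = 1$), assuming Gaussian membership functions. The centers, the width and the function names are illustrative choices rather than values prescribed at this point of the paper.

```python
import math

# Assumed Gaussian centers and width for a five-rule, single-input system
CENTERS = [3.0, 1.0, 0.0, -1.0, -3.0]
WIDTH = 8.0


def memberships(x, centers=CENTERS, width=WIDTH):
    """Gaussian membership degrees mu_k(x) = exp(-(x - c_k)^2 / width)."""
    return [math.exp(-((x - c) ** 2) / width) for c in centers]


def basis(x, centers=CENTERS, width=WIDTH):
    """Normalized fuzzy basis vector S(x); its entries sum to one."""
    mu = memberships(x, centers, width)
    total = sum(mu)
    return [m / total for m in mu]


def fls_output(weights, x):
    """Fuzzy logic system output y(x) = weights^T S(x)."""
    return sum(w * s for w, s in zip(weights, basis(x)))


if __name__ == "__main__":
    print(basis(0.0))
    print(fls_output([0.8, 0.8, 0.8, 0.8, 0.8], 1.3))
```

Because the basis vector sums to one, equal weights reproduce that common value for every input, which is a quick check that the normalization is implemented correctly.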

Lemma 2.4

[Citation24]

Let $\phi(x)$ be a continuous function defined on a compact set $\wp$. There exists a positive constant $\varphi$ that satisfies the following inequality:
$$\sup_{x \in \wp} \left|\phi(x) - \vartheta^T S(x)\right| \le \varphi. \tag{9}$$

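By means of (9), the following fuzzy logic systems are used to approximate the functions $\phi_i$ ($i = 1, 2, \ldots, n$):
$$\phi_i(\bar{x}_i \mid \hat{\vartheta}_i) = \hat{\vartheta}_i^T S_i(\bar{x}_i), \tag{10}$$
where $\hat{\vartheta}_i$ denotes the estimate of the ideal weight vector $\vartheta_i^*$. The ideal weight vectors $\vartheta_i^*$ can be described as
$$\vartheta_i^* = \arg\min_{\hat{\vartheta}_i \in \Omega_i} \left[\sup_{\bar{x}_i \in \wp_i} \left|\phi_i(\bar{x}_i) - \hat{\vartheta}_i^T S_i(\bar{x}_i)\right|\right], \tag{11}$$
where $\Omega_i$ is a bounded set.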

The control objective of this article is to design a disturbance observer-based fuzzy adaptive optimal finite-time control scheme for the nonlinear system (1), so that (1) all the signals in the system are finite-time stable; (2) all the system states remain in their constrained sets; (3) the system output tracks the reference trajectory.

3. Optimal controller design

In what follows, a fuzzy adaptive optimal finite-time strategy is designed to achieve the control objective by virtue of the backstepping process and the actor-critic framework, in which barrier-type functions are used in the cost functions. Treat the bias fault $u_f$ in (2) and the external disturbance $D_n$ as a lumped disturbance. The nonlinear system (1) can then be expressed as
$$\begin{cases}\dot{x}_i = x_{i+1} + \phi_i(\bar{x}_i) + d_i(t), & 1 \le i \le n-1,\\ \dot{x}_n = u + \phi_n(\bar{x}_n) + d_n(t),\\ y(t) = x_1,\end{cases} \tag{12}$$
where $d_i = D_i$ ($i = 1, 2, \ldots, n-1$) and $d_n = u_f + D_n$.

Remark 3.1

Many practical systems contain perturbation terms, which have a negative effect on control quality. Different from the strategies proposed in [40,47], this paper employs a disturbance observer to track the external disturbances online, so as to mitigate their negative effects and improve the control performance of the system. Since $|u_f| \le F$ and $|D_n| \le \bar{D}_n$, we deduce $|d_n| = |D_n + u_f| \le \bar{D}_n + F = \bar{d}_n$. Thus the total disturbance $d_n$ is bounded.

Then the following coordinate transformation is introduced:
$$z_1 = x_1 - y_r, \qquad z_i = x_i - \hat{\alpha}_{i-1}, \quad 2 \le i \le n, \tag{13}$$
where $\hat{\alpha}_{i-1}$ is the optimal virtual control and $y_r$ is the desired tracking trajectory.

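Step 1: From (1) and (13), the time derivative of $z_1$ is
$$\dot{z}_1 = \dot{x}_1 - \dot{y}_r = x_2 + \phi_1 + d_1 - \dot{y}_r. \tag{14}$$
Choose the infinite-horizon integral barrier-type performance index function associated with (1) as
$$J_1(z_1(0)) = \int_0^{\infty} h_1(z_1(s), \alpha_1(z_1))\,ds, \tag{15}$$
where $h_1(z_1, \alpha_1) = \psi_1 \log\frac{k_{1b}^2}{k_{1b}^2 - z_1^2} + \alpha_1^2$ is the cost function with $\psi_1 > 0$, $\alpha_1$ is the virtual controller and $\Omega_1 = \{z_1 : |z_1| < |k_{1b}|\}$ is a compact set. Let $\alpha_1^*$ be the optimal virtual control. The optimal performance index function that attains the minimum of (15) is
$$J_1^*(z_1(t)) = \int_t^{\infty} h_1(z_1(s), \alpha_1^*(z_1))\,ds = \min_{\alpha_1}\left(\int_t^{\infty} h_1(z_1(s), \alpha_1(z_1))\,ds\right). \tag{16}$$
Taking the time derivative of (16), we acquire
$$\frac{dJ_1^*}{dt} = \frac{\partial J_1^*}{\partial z_1}\left(x_2 + \phi_1 + d_1 - \dot{y}_r\right). \tag{17}$$
Define the HJB equation associated with (17) as
$$H_1\left(z_1, \alpha_1^*, \frac{\partial J_1^*}{\partial z_1}\right) = \psi_1 \log\frac{k_{1b}^2}{k_{1b}^2 - z_1^2} + \alpha_1^{*2} + \frac{\partial J_1^*}{\partial z_1}\left(\alpha_1^* + \phi_1 + d_1 - \dot{y}_r\right) = 0. \tag{18}$$
By solving the equation $\partial H_1 / \partial \alpha_1^* = 0$, the optimal virtual controller is obtained as
$$\alpha_1^* = -\frac{1}{2}\frac{\partial J_1^*}{\partial z_1}. \tag{19}$$
With the aim of realizing finite-time optimal control, $\partial J_1^* / \partial z_1$ is decomposed as
$$\frac{\partial J_1^*}{\partial z_1} = \frac{2\sigma_1 z_1^{2\delta-1}}{(k_{1b}^2 - z_1^2)^{\delta-1}} + 2\bar{\sigma}_1 z_1 + 2\hat{d}_1 + \frac{9}{2}\frac{z_1}{k_{1b}^2 - z_1^2} + 2\phi_1 + J_1^o(z_1), \tag{20}$$
where $\sigma_1$ and $\bar{\sigma}_1$ are designed positive parameters, $\hat{d}_1$ is the disturbance estimate, and
$$J_1^o(z_1) = -\frac{2\sigma_1 z_1^{2\delta-1}}{(k_{1b}^2 - z_1^2)^{\delta-1}} - 2\bar{\sigma}_1 z_1 - 2\hat{d}_1 - \frac{9}{2}\frac{z_1}{k_{1b}^2 - z_1^2} - 2\phi_1 + \frac{\partial J_1^*}{\partial z_1}.$$
Merging (19) and (20), we verify
$$\alpha_1^* = -\frac{\sigma_1 z_1^{2\delta-1}}{(k_{1b}^2 - z_1^2)^{\delta-1}} - \bar{\sigma}_1 z_1 - \hat{d}_1 - \frac{9}{4}\frac{z_1}{k_{1b}^2 - z_1^2} - \phi_1 - \frac{1}{2}J_1^o(z_1). \tag{21}$$
In (21), $J_1^o$ and $\phi_1$ are unknown functions that can be approximated as
$$J_1^o = \vartheta_{J1}^{*T} S_{J1} + \varphi_{J1}, \tag{22}$$
$$\phi_1 = \vartheta_{f1}^{*T} S_{f1} + \varphi_{f1}, \tag{23}$$
where $\vartheta_{J1}^*$ and $\vartheta_{f1}^*$ are the ideal weights and $S_{J1}$, $S_{f1}$ are the basis vectors. Combining (20), (21), (22) and (23), we get
$$\alpha_1^* = -\frac{\sigma_1 z_1^{2\delta-1}}{(k_{1b}^2 - z_1^2)^{\delta-1}} - \bar{\sigma}_1 z_1 - \hat{d}_1 - \frac{1}{2}\vartheta_{J1}^{*T} S_{J1} - \frac{9}{4}\frac{z_1}{k_{1b}^2 - z_1^2} - \vartheta_{f1}^{*T} S_{f1} - \frac{1}{2}\varphi_1, \tag{24}$$
$$\frac{\partial J_1^*}{\partial z_1} = \frac{2\sigma_1 z_1^{2\delta-1}}{(k_{1b}^2 - z_1^2)^{\delta-1}} + 2\bar{\sigma}_1 z_1 + 2\hat{d}_1 + \frac{9}{2}\frac{z_1}{k_{1b}^2 - z_1^2} + \vartheta_{J1}^{*T} S_{J1} + \varphi_1 + 2\vartheta_{f1}^{*T} S_{f1}, \tag{25}$$
where $\varphi_1 = 2\varphi_{f1} + \varphi_{J1}$. Note that $\alpha_1^*$ is not directly available since $\vartheta_{J1}^*$ and $\vartheta_{f1}^*$ are unknown ideal weights. The estimate of the unknown function $\phi_1$ can be represented as $\hat{\phi}_1 = \hat{\vartheta}_{f1}^T S_{f1}$.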
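The role of the barrier term in the cost function can be illustrated numerically. The following is a minimal sketch, assuming the cost has the form $h(z, \alpha) = \psi \log\frac{k_b^2}{k_b^2 - z^2} + \alpha^2$. The function name `barrier_cost` and the sample values are hypothetical.

```python
import math


def barrier_cost(z, alpha, k_b, psi=1.0):
    """Barrier-type running cost: psi * log(k_b^2 / (k_b^2 - z^2)) + alpha^2.

    The logarithmic term is zero at z = 0 and grows without bound as |z|
    approaches the constraint boundary k_b, which penalizes constraint
    violation inside the performance index.
    """
    if abs(z) >= k_b:
        raise ValueError("the error must satisfy |z| < k_b")
    return psi * math.log(k_b ** 2 / (k_b ** 2 - z ** 2)) + alpha ** 2


if __name__ == "__main__":
    for z in (0.0, 0.5, 0.9, 0.99):
        print(z, barrier_cost(z, alpha=0.0, k_b=1.0))
```

The printed values increase sharply near the boundary, which is why keeping the performance index finite also keeps the error inside the constraint set.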
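The adaptive law $\dot{\hat{\vartheta}}_{f1}$ is constructed as
$$\dot{\hat{\vartheta}}_{f1} = \frac{z_1}{k_{1b}^2 - z_1^2} S_{f1} - m_1 \hat{\vartheta}_{f1}, \tag{26}$$
where $m_1 > 0$. To obtain an available $\alpha_1$, the critic-actor structure with the critic and actor adaptive laws is introduced as follows:
$$\frac{\partial \hat{J}_1^*}{\partial z_1} = \frac{2\sigma_1 z_1^{2\delta-1}}{(k_{1b}^2 - z_1^2)^{\delta-1}} + 2\bar{\sigma}_1 z_1 + 2\hat{d}_1 + \frac{9}{2}\frac{z_1}{k_{1b}^2 - z_1^2} + \hat{\vartheta}_{c1}^T S_{J1} + 2\hat{\vartheta}_{f1}^T S_{f1}, \tag{27}$$
where $\partial \hat{J}_1^* / \partial z_1$ is the estimate of $\partial J_1^* / \partial z_1$. Design the critic update law as
$$\dot{\hat{\vartheta}}_{c1} = -\gamma_{c1} S_{J1} S_{J1}^T \hat{\vartheta}_{c1}, \tag{28}$$
where $\gamma_{c1} > 0$. The virtual controller built on the actor weights is
$$\hat{\alpha}_1 = -\frac{\sigma_1 z_1^{2\delta-1}}{(k_{1b}^2 - z_1^2)^{\delta-1}} - \bar{\sigma}_1 z_1 - \hat{d}_1 - \frac{9}{4}\frac{z_1}{k_{1b}^2 - z_1^2} - \frac{1}{2}\hat{\vartheta}_{a1}^T S_{J1} - \hat{\vartheta}_{f1}^T S_{f1}. \tag{29}$$
Correspondingly, the actor update law is designed as
$$\dot{\hat{\vartheta}}_{a1} = -S_{J1} S_{J1}^T \left[\gamma_{a1}(\hat{\vartheta}_{a1} - \hat{\vartheta}_{c1}) + \gamma_{c1}\hat{\vartheta}_{c1}\right], \tag{30}$$
where $\gamma_{a1} > 0$. By substituting (29) and (27) into (18), we obtain
$$H_1\left(z_1, \hat{\alpha}_1, \frac{\partial \hat{J}_1^*}{\partial z_1}\right) = \psi_1 \log\frac{k_{1b}^2}{k_{1b}^2 - z_1^2} + \hat{\alpha}_1^2 + \frac{\partial \hat{J}_1^*}{\partial z_1}\left(\hat{\alpha}_1 + \phi_1 + d_1 - \dot{y}_r\right),$$
with $\hat{\alpha}_1$ and $\partial \hat{J}_1^* / \partial z_1$ given by (29) and (27). The Bellman residual error $e_1$ is expressed as
$$e_1 = H_1\left(z_1, \hat{\alpha}_1, \frac{\partial \hat{J}_1^*}{\partial z_1}\right) - H_1\left(z_1, \alpha_1^*, \frac{\partial J_1^*}{\partial z_1}\right) = H_1\left(z_1, \hat{\alpha}_1, \frac{\partial \hat{J}_1^*}{\partial z_1}\right). \tag{31}$$
The optimal virtual controller $\hat{\alpha}_1$ is constructed to guarantee $H_1(z_1, \hat{\alpha}_1, \partial \hat{J}_1^* / \partial z_1) \to 0$. If $H_1(z_1, \hat{\alpha}_1, \partial \hat{J}_1^* / \partial z_1) = 0$ has a unique solution, then one obtains
$$\frac{\partial H_1(z_1, \hat{\alpha}_1, \partial \hat{J}_1^* / \partial z_1)}{\partial \hat{\vartheta}_{a1}} = \frac{1}{2} S_{J1} S_{J1}^T (\hat{\vartheta}_{a1} - \hat{\vartheta}_{c1}) = 0. \tag{32}$$
Construct a positive function
$$E_1 = (\hat{\vartheta}_{a1} - \hat{\vartheta}_{c1})^T (\hat{\vartheta}_{a1} - \hat{\vartheta}_{c1}). \tag{33}$$
Clearly, $E_1 = 0$ is equivalent to (32). The actor and critic adaptive laws can be designed in view of the following relation:
$$\frac{\partial E_1}{\partial \hat{\vartheta}_{a1}} = -\frac{\partial E_1}{\partial \hat{\vartheta}_{c1}} = 2(\hat{\vartheta}_{a1} - \hat{\vartheta}_{c1}). \tag{34}$$
Thus we have
$$\dot{E}_1 = \frac{\partial E_1}{\partial \hat{\vartheta}_{c1}^T}\dot{\hat{\vartheta}}_{c1} + \frac{\partial E_1}{\partial \hat{\vartheta}_{a1}^T}\dot{\hat{\vartheta}}_{a1} = -\gamma_{c1}\frac{\partial E_1}{\partial \hat{\vartheta}_{c1}^T} S_{J1} S_{J1}^T \hat{\vartheta}_{c1} - \frac{\partial E_1}{\partial \hat{\vartheta}_{a1}^T} S_{J1} S_{J1}^T \left[\gamma_{a1}(\hat{\vartheta}_{a1} - \hat{\vartheta}_{c1}) + \gamma_{c1}\hat{\vartheta}_{c1}\right] = -\frac{\gamma_{a1}}{2}\frac{\partial E_1}{\partial \hat{\vartheta}_{a1}^T} S_{J1} S_{J1}^T \frac{\partial E_1}{\partial \hat{\vartheta}_{a1}} \le 0. \tag{35}$$
Therefore, (28) and (30) enable (32) to be finally realized. The disturbance observer is designed as
$$\hat{d}_1 = \ell_{f1}(x_1 - \kappa_1), \qquad \dot{\kappa}_1 = x_2 + \hat{d}_1 + \hat{\vartheta}_{f1}^T S_{f1}, \tag{36}$$
where $\ell_{f1} > 0$. Define the Lyapunov function as follows:
$$V_1 = \frac{1}{2}\log\frac{k_{1b}^2}{k_{1b}^2 - z_1^2} + \frac{1}{2}\tilde{\vartheta}_{c1}^T\tilde{\vartheta}_{c1} + \frac{1}{2}\tilde{\vartheta}_{a1}^T\tilde{\vartheta}_{a1} + \frac{1}{2}\tilde{\vartheta}_{f1}^T\tilde{\vartheta}_{f1} + \frac{1}{2}\tilde{d}_1^2, \tag{37}$$
where $\tilde{\vartheta}_{c1} = \vartheta_{J1}^* - \hat{\vartheta}_{c1}$, and the remaining errors are defined analogously.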
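The effect of the critic and actor update laws on the gap $E_1$ can be checked with a short simulation. The sketch below assumes the laws $\dot{\hat\vartheta}_c = -\gamma_c S S^T \hat\vartheta_c$ and $\dot{\hat\vartheta}_a = -S S^T[\gamma_a(\hat\vartheta_a - \hat\vartheta_c) + \gamma_c \hat\vartheta_c]$, uses forward Euler integration, and holds the basis vector $S$ fixed for simplicity (in the actual scheme it varies with $z_1$). All names, gains and values are illustrative.

```python
def rank_one_apply(s, v):
    """Return (s s^T) v for vectors given as plain lists."""
    projection = sum(si * vi for si, vi in zip(s, v))
    return [si * projection for si in s]


def critic_actor_step(theta_c, theta_a, s, gamma_c, gamma_a, dt):
    """One forward-Euler step of the assumed critic and actor update laws."""
    c_dot = [-gamma_c * g for g in rank_one_apply(s, theta_c)]
    mixed = [gamma_a * (a - c) + gamma_c * c for a, c in zip(theta_a, theta_c)]
    a_dot = [-g for g in rank_one_apply(s, mixed)]
    new_c = [c + dt * cd for c, cd in zip(theta_c, c_dot)]
    new_a = [a + dt * ad for a, ad in zip(theta_a, a_dot)]
    return new_c, new_a


def gap(theta_c, theta_a):
    """E = (theta_a - theta_c)^T (theta_a - theta_c)."""
    return sum((a - c) ** 2 for a, c in zip(theta_a, theta_c))


if __name__ == "__main__":
    s = [0.1, 0.2, 0.4, 0.2, 0.1]      # fixed basis vector (assumption)
    theta_c, theta_a = [0.3] * 5, [0.1] * 5
    for step in range(2001):
        if step % 500 == 0:
            print(step, gap(theta_c, theta_a))
        theta_c, theta_a = critic_actor_step(theta_c, theta_a, s, 2.0, 3.0, 0.01)
```

With a fixed $S$ the matrix $S S^T$ has rank one, so only the component of the gap along $S$ decays and $E_1$ settles at a positive value; a basis vector that keeps changing direction is needed for the whole gap to vanish.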
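In view of (13), (28) and (30), the time derivative of $V_1$ is
$$\begin{aligned}\dot{V}_1 ={}& \frac{z_1}{k_{1b}^2 - z_1^2}\dot{z}_1 - \tilde{\vartheta}_{c1}^T\dot{\hat{\vartheta}}_{c1} - \tilde{\vartheta}_{a1}^T\dot{\hat{\vartheta}}_{a1} - \tilde{\vartheta}_{f1}^T\dot{\hat{\vartheta}}_{f1} + \tilde{d}_1\dot{\tilde{d}}_1\\ ={}& \frac{z_1}{k_{1b}^2 - z_1^2}\left(z_2 - \frac{\sigma_1 z_1^{2\delta-1}}{(k_{1b}^2 - z_1^2)^{\delta-1}} - \bar{\sigma}_1 z_1 - \hat{d}_1 - \frac{1}{2}\hat{\vartheta}_{a1}^T S_{J1} - \hat{\vartheta}_{f1}^T S_{f1} - \frac{9}{4}\frac{z_1}{k_{1b}^2 - z_1^2} - \dot{y}_r + \phi_1 + d_1\right)\\ &+ \tilde{\vartheta}_{a1}^T S_{J1} S_{J1}^T\left(\gamma_{a1}(\hat{\vartheta}_{a1} - \hat{\vartheta}_{c1}) + \gamma_{c1}\hat{\vartheta}_{c1}\right) - \tilde{\vartheta}_{f1}^T\left(\frac{z_1}{k_{1b}^2 - z_1^2} S_{f1} - m_1\hat{\vartheta}_{f1}\right)\\ &+ \gamma_{c1}\tilde{\vartheta}_{c1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{c1} + \tilde{d}_1\left(\dot{d}_1 - \ell_{f1}(\tilde{d}_1 - \tilde{\vartheta}_{f1}^T S_{f1} - \varphi_{f1})\right).\end{aligned} \tag{38}$$
The following relations hold by Young's inequality:
$$\frac{z_1}{k_{1b}^2 - z_1^2} z_2 \le \frac{1}{2}\frac{z_1^2}{(k_{1b}^2 - z_1^2)^2} + \frac{1}{2}z_2^2, \tag{39}$$
$$\frac{z_1}{k_{1b}^2 - z_1^2}\varphi_{f1} \le \frac{1}{2}\frac{z_1^2}{(k_{1b}^2 - z_1^2)^2} + \frac{1}{2}\varphi_{f1}^2, \tag{40}$$
$$-\frac{z_1}{k_{1b}^2 - z_1^2}\dot{y}_r \le \frac{1}{2}\frac{z_1^2}{(k_{1b}^2 - z_1^2)^2} + \frac{1}{2}\dot{y}_r^2, \tag{41}$$
$$\frac{z_1}{k_{1b}^2 - z_1^2}\tilde{d}_1 \le \frac{1}{2}\frac{z_1^2}{(k_{1b}^2 - z_1^2)^2} + \frac{1}{2}\tilde{d}_1^2, \tag{42}$$
$$-\frac{1}{2}\frac{z_1}{k_{1b}^2 - z_1^2}\hat{\vartheta}_{a1}^T S_{J1} \le \frac{1}{4}\frac{z_1^2}{(k_{1b}^2 - z_1^2)^2} + \frac{1}{4}\hat{\vartheta}_{a1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{a1}. \tag{43}$$
From (39)–(43), we obtain
$$\begin{aligned}\dot{V}_1 \le{}& -\frac{\sigma_1 z_1^{2\delta}}{(k_{1b}^2 - z_1^2)^{\delta}} - \frac{\bar{\sigma}_1 z_1^2}{k_{1b}^2 - z_1^2} + \frac{1}{2}\tilde{d}_1^2 + \frac{1}{2}z_2^2 + \tilde{\vartheta}_{a1}^T S_{J1} S_{J1}^T\left(\gamma_{a1}(\hat{\vartheta}_{a1} - \hat{\vartheta}_{c1}) + \gamma_{c1}\hat{\vartheta}_{c1}\right)\\ &- \tilde{\vartheta}_{f1}^T\left(\frac{z_1}{k_{1b}^2 - z_1^2} S_{f1} - m_1\hat{\vartheta}_{f1}\right) + \gamma_{c1}\tilde{\vartheta}_{c1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{c1}\\ &+ \tilde{d}_1\left(\dot{d}_1 - \ell_{f1}(\tilde{d}_1 - \tilde{\vartheta}_{f1}^T S_{f1} - \varphi_{f1})\right) + \frac{1}{4}\hat{\vartheta}_{a1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{a1}.\end{aligned} \tag{44}$$
In light of $\tilde{\vartheta}_{c1} = \vartheta_{J1}^* - \hat{\vartheta}_{c1}$, $\tilde{\vartheta}_{a1} = \vartheta_{J1}^* - \hat{\vartheta}_{a1}$ and $\tilde{\vartheta}_{f1} = \vartheta_{f1}^* - \hat{\vartheta}_{f1}$, we get
$$\gamma_{c1}\tilde{\vartheta}_{c1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{c1} = \frac{1}{2}\gamma_{c1}\left[\vartheta_{J1}^{*T} S_{J1} S_{J1}^T \vartheta_{J1}^* - \tilde{\vartheta}_{c1}^T S_{J1} S_{J1}^T \tilde{\vartheta}_{c1} - \hat{\vartheta}_{c1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{c1}\right], \tag{45}$$
$$\gamma_{a1}\tilde{\vartheta}_{a1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{a1} = \frac{1}{2}\gamma_{a1}\left[\vartheta_{J1}^{*T} S_{J1} S_{J1}^T \vartheta_{J1}^* - \tilde{\vartheta}_{a1}^T S_{J1} S_{J1}^T \tilde{\vartheta}_{a1} - \hat{\vartheta}_{a1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{a1}\right], \tag{46}$$
$$(\gamma_{c1} - \gamma_{a1})\tilde{\vartheta}_{a1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{c1} \le \frac{\gamma_{c1} - \gamma_{a1}}{2}\left[\tilde{\vartheta}_{a1}^T S_{J1} S_{J1}^T \tilde{\vartheta}_{a1} - \hat{\vartheta}_{c1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{c1}\right], \tag{47}$$
$$m_1\tilde{\vartheta}_{f1}^T\hat{\vartheta}_{f1} \le \frac{m_1}{2}\|\vartheta_{f1}^*\|^2 - \frac{m_1}{2}\|\tilde{\vartheta}_{f1}\|^2. \tag{48}$$
Substituting (45)–(48) into (44), we confirm
$$\begin{aligned}\dot{V}_1 \le{}& -\frac{\sigma_1 z_1^{2\delta}}{(k_{1b}^2 - z_1^2)^{\delta}} - \frac{\bar{\sigma}_1 z_1^2}{k_{1b}^2 - z_1^2} - \frac{\gamma_{c1}}{2}\tilde{\vartheta}_{c1}^T S_{J1} S_{J1}^T \tilde{\vartheta}_{c1} - \frac{2\gamma_{a1} - \gamma_{c1}}{2}\tilde{\vartheta}_{a1}^T S_{J1} S_{J1}^T \tilde{\vartheta}_{a1} - \frac{m_1}{2}\|\tilde{\vartheta}_{f1}\|^2 + M_{1,1}\\ &- \frac{1}{2}\gamma_{c1}\hat{\vartheta}_{c1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{c1} - \left(\frac{1}{2}\gamma_{a1} - \frac{1}{4}\right)\hat{\vartheta}_{a1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{a1} - \frac{\gamma_{c1} - \gamma_{a1}}{2}\hat{\vartheta}_{c1}^T S_{J1} S_{J1}^T \hat{\vartheta}_{c1}\\ &+ \tilde{d}_1\left(\dot{d}_1 - \ell_{f1}(\tilde{d}_1 - \tilde{\vartheta}_{f1}^T S_{f1} - \varphi_{f1})\right),\end{aligned} \tag{49}$$
where $\gamma_{a1} > \gamma_{c1}/2$, $\gamma_{a1} > \frac{1}{2}$ and $M_{1,1} = \frac{\gamma_{c1} + \gamma_{a1}}{2}\left(\vartheta_{J1}^{*T} S_{J1} S_{J1}^T \vartheta_{J1}^*\right) + \frac{1}{2}\dot{y}_r^2 + \frac{m_1}{2}\|\vartheta_{f1}^*\|^2 + \frac{1}{2}\varphi_{f1}^2 + \frac{1}{2}k_{2b}^2$, by reason of $\frac{1}{2}z_2^2 < \frac{1}{2}k_{2b}^2$. Let $\lambda_{\min}^{S_{J1}}$ and $\lambda_{\min}^{S_{f1}}$ be the minimal eigenvalues of $S_{J1} S_{J1}^T$ and $S_{f1} S_{f1}^T$, respectively.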
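The following inequalities hold:
$$\tilde{d}_1\dot{d}_1 \le \frac{1}{2}\tilde{d}_1^2 + \frac{1}{2}\underline{d}_1^2, \tag{50}$$
$$\ell_{f1}\tilde{d}_1\varphi_{f1} \le \frac{1}{2}\ell_{f1}\tilde{d}_1^2 + \frac{1}{2}\ell_{f1}\varphi_{f1}^2, \tag{51}$$
$$\ell_{f1}\tilde{d}_1\tilde{\vartheta}_{f1}^T S_{f1} \le \frac{1}{2}\ell_{f1}\tilde{d}_1^2 + \frac{1}{2}\ell_{f1}\lambda_{\min}^{S_{f1}}\|\tilde{\vartheta}_{f1}\|^2, \tag{52}$$
$$-\frac{\gamma_{c1}}{2}\tilde{\vartheta}_{c1}^T S_{J1} S_{J1}^T \tilde{\vartheta}_{c1} \le -\frac{\gamma_{c1}}{2}\lambda_{\min}^{S_{J1}}\tilde{\vartheta}_{c1}^T\tilde{\vartheta}_{c1}, \tag{53}$$
$$-\frac{2\gamma_{a1} - \gamma_{c1}}{2}\tilde{\vartheta}_{a1}^T S_{J1} S_{J1}^T \tilde{\vartheta}_{a1} \le -\frac{2\gamma_{a1} - \gamma_{c1}}{2}\lambda_{\min}^{S_{J1}}\tilde{\vartheta}_{a1}^T\tilde{\vartheta}_{a1}. \tag{54}$$
Substituting (50)–(54) into (49), we deduce
$$\dot{V}_1 \le -\frac{\sigma_1 z_1^{2\delta}}{(k_{1b}^2 - z_1^2)^{\delta}} - \frac{\bar{\sigma}_1 z_1^2}{k_{1b}^2 - z_1^2} + M_1 - \frac{2\gamma_{a1} - \gamma_{c1}}{2}\lambda_{\min}^{S_{J1}}\|\tilde{\vartheta}_{a1}\|^2 - \frac{\gamma_{c1}}{2}\lambda_{\min}^{S_{J1}}\|\tilde{\vartheta}_{c1}\|^2 - \frac{1}{2}\left(m_1 + \ell_{f1}\lambda_{\min}^{S_{f1}}\right)\|\tilde{\vartheta}_{f1}\|^2 - \frac{1}{2}\left(\ell_{f1} - \frac{3}{2}\right)\tilde{d}_1^2, \tag{55}$$
where $\ell_{f1} > \frac{3}{2}$ and $M_1 = M_{1,1} + \frac{1}{2}\underline{d}_1^2 + \frac{1}{2}\ell_{f1}\varphi_{f1}^2$.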
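The disturbance observer of Step 1 can be tried out on a scalar plant. The following is a minimal sketch under simplifying assumptions that are not part of the paper's design: the nonlinearity $\phi$ is taken as exactly known (the paper uses the fuzzy estimate $\hat\vartheta_{f1}^T S_{f1}$ instead), the signal `v` stands in for $x_2$, and forward Euler integration is used. The disturbance $2\sin(0.5t)$ matches the simulation example; the gain, the stand-in input and the function name are illustrative.

```python
import math


def simulate_observer(gain=20.0, dt=1e-3, t_end=10.0):
    """Simulate d_hat = gain * (x - kappa), kappa_dot = v + d_hat + phi(x).

    Returns the list of absolute estimation errors |d - d_hat| over time.
    """
    x, kappa = 0.0, 0.0
    errors = []
    for k in range(int(t_end / dt)):
        t = k * dt
        d = 2.0 * math.sin(0.5 * t)          # true disturbance
        phi = math.sin(x) * math.cos(x)      # nonlinearity, assumed known here
        v = -2.0 * x                         # stand-in for the x2 channel
        d_hat = gain * (x - kappa)           # observer output
        errors.append(abs(d - d_hat))
        x_dot = v + phi + d                  # plant
        kappa_dot = v + d_hat + phi          # observer internal state
        x += dt * x_dot
        kappa += dt * kappa_dot
    return errors


if __name__ == "__main__":
    errs = simulate_observer()
    print("largest estimation error:", max(errs))
```

Under these assumptions the estimation error obeys $\dot{\tilde d} = \dot d - \ell\,\tilde d$, so a larger gain shrinks the error roughly in proportion to $|\dot d| / \ell$, at the cost of higher sensitivity to measurement noise.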

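Step $i$ ($2 \le i \le n-1$): From (12) and (13), the time derivative of $z_i$ is
$$\dot{z}_i = x_{i+1} + \phi_i + d_i - \dot{\hat{\alpha}}_{i-1}. \tag{56}$$
Introduce the performance index function as
$$J_i(z_i(0)) = \int_0^{\infty} h_i(z_i(s), \alpha_i(z_i))\,ds, \tag{57}$$
where $h_i(z_i, \alpha_i) = \psi_i \log\frac{k_{ib}^2}{k_{ib}^2 - z_i^2} + \alpha_i^2$ represents the cost function and $\alpha_i$ denotes the virtual controller. Let $\alpha_i^*$ be the optimal virtual controller. Similar to (16), the optimal performance index function that attains the minimum of (57) is established as follows:
$$J_i^*(z_i(t)) = \int_t^{\infty} h_i(z_i(s), \alpha_i^*(z_i))\,ds = \min_{\alpha_i}\left(\int_t^{\infty} h_i(z_i(s), \alpha_i(z_i))\,ds\right). \tag{58}$$
Taking the time derivative of (58), we deduce
$$\frac{dJ_i^*}{dt} = \frac{\partial J_i^*}{\partial z_i}\left(x_{i+1} + \phi_i + d_i - \dot{\hat{\alpha}}_{i-1}\right) = \frac{\partial J_i^*}{\partial z_i}\left(z_{i+1} + \alpha_i^* + \phi_i + d_i - \dot{\hat{\alpha}}_{i-1}\right). \tag{59}$$
Define the HJB equation associated with (59) as follows:
$$H_i\left(z_i, \alpha_i^*, \frac{\partial J_i^*}{\partial z_i}\right) = \psi_i \log\frac{k_{ib}^2}{k_{ib}^2 - z_i^2} + \alpha_i^{*2} + \frac{\partial J_i^*}{\partial z_i}\left(\alpha_i^* + \phi_i + d_i - \dot{\hat{\alpha}}_{i-1}\right) = 0. \tag{60}$$
Solving the equation $\partial H_i / \partial \alpha_i^* = 0$, the optimal virtual controller is determined by
$$\alpha_i^* = -\frac{1}{2}\frac{\partial J_i^*}{\partial z_i}. \tag{61}$$
For the purpose of optimal control, $\partial J_i^* / \partial z_i$ can be decomposed as
$$\frac{\partial J_i^*}{\partial z_i} = \frac{2\sigma_i z_i^{2\delta-1}}{(k_{ib}^2 - z_i^2)^{\delta-1}} + 2\bar{\sigma}_i z_i + 2\hat{d}_i + \frac{9}{2}\frac{z_i}{k_{ib}^2 - z_i^2} + 2\phi_i + J_i^o(z_i), \tag{62}$$
where $\sigma_i, \bar{\sigma}_i > 0$ and
$$J_i^o(z_i) = -\frac{2\sigma_i z_i^{2\delta-1}}{(k_{ib}^2 - z_i^2)^{\delta-1}} - 2\bar{\sigma}_i z_i - 2\hat{d}_i - \frac{9}{2}\frac{z_i}{k_{ib}^2 - z_i^2} - 2\phi_i + \frac{\partial J_i^*}{\partial z_i}.$$
Thus (61) can be represented as
$$\alpha_i^* = -\frac{\sigma_i z_i^{2\delta-1}}{(k_{ib}^2 - z_i^2)^{\delta-1}} - \bar{\sigma}_i z_i - \hat{d}_i - \frac{9}{4}\frac{z_i}{k_{ib}^2 - z_i^2} - \phi_i - \frac{1}{2}J_i^o(z_i). \tag{63}$$
In (63), $J_i^o$ and $\phi_i$ are approximated by fuzzy logic systems as
$$J_i^o = \vartheta_{Ji}^{*T} S_{Ji} + \varphi_{Ji}, \tag{64}$$
$$\phi_i = \vartheta_{fi}^{*T} S_{fi} + \varphi_{fi}. \tag{65}$$
Combining (64) and (65) with (63) and (62), respectively, yields
$$\alpha_i^* = -\frac{\sigma_i z_i^{2\delta-1}}{(k_{ib}^2 - z_i^2)^{\delta-1}} - \bar{\sigma}_i z_i - \hat{d}_i - \frac{1}{2}\vartheta_{Ji}^{*T} S_{Ji} - \frac{9}{4}\frac{z_i}{k_{ib}^2 - z_i^2} - \vartheta_{fi}^{*T} S_{fi} - \frac{1}{2}\varphi_i, \tag{66}$$
$$\frac{\partial J_i^*}{\partial z_i} = \frac{2\sigma_i z_i^{2\delta-1}}{(k_{ib}^2 - z_i^2)^{\delta-1}} + 2\bar{\sigma}_i z_i + 2\hat{d}_i + \frac{9}{2}\frac{z_i}{k_{ib}^2 - z_i^2} + \vartheta_{Ji}^{*T} S_{Ji} + \varphi_i + 2\vartheta_{fi}^{*T} S_{fi}, \tag{67}$$
where $\varphi_i = 2\varphi_{fi} + \varphi_{Ji}$. Note that $\alpha_i^*$ is unavailable because $\vartheta_{Ji}^*$ and $\vartheta_{fi}^*$ are unknown ideal weights.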
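Resembling (27) and (29), the actor-critic structure is developed as
$$\frac{\partial \hat{J}_i^*}{\partial z_i} = \frac{2\sigma_i z_i^{2\delta-1}}{(k_{ib}^2 - z_i^2)^{\delta-1}} + 2\bar{\sigma}_i z_i + 2\hat{d}_i + \frac{9}{2}\frac{z_i}{k_{ib}^2 - z_i^2} + \hat{\vartheta}_{ci}^T S_{Ji} + 2\hat{\vartheta}_{fi}^T S_{fi}. \tag{68}$$
Design the critic update law, optimal virtual control law and actor update law as
$$\dot{\hat{\vartheta}}_{ci} = -\gamma_{ci} S_{Ji} S_{Ji}^T \hat{\vartheta}_{ci}, \tag{69}$$
$$\hat{\alpha}_i = -\frac{\sigma_i z_i^{2\delta-1}}{(k_{ib}^2 - z_i^2)^{\delta-1}} - \bar{\sigma}_i z_i - \hat{d}_i - \frac{9}{4}\frac{z_i}{k_{ib}^2 - z_i^2} - \frac{1}{2}\hat{\vartheta}_{ai}^T S_{Ji} - \hat{\vartheta}_{fi}^T S_{fi}, \tag{70}$$
$$\dot{\hat{\vartheta}}_{ai} = -S_{Ji} S_{Ji}^T\left[\gamma_{ai}(\hat{\vartheta}_{ai} - \hat{\vartheta}_{ci}) + \gamma_{ci}\hat{\vartheta}_{ci}\right], \tag{71}$$
where $\gamma_{ci}$ and $\gamma_{ai}$ are positive numbers. The fuzzy update law is constructed as
$$\dot{\hat{\vartheta}}_{fi} = \frac{z_i}{k_{ib}^2 - z_i^2} S_{fi} - m_i\hat{\vartheta}_{fi}, \tag{72}$$
where $m_i$ is a positive constant. By substituting (68) and (70) into (60), we acquire
$$H_i\left(z_i, \hat{\alpha}_i, \frac{\partial \hat{J}_i^*}{\partial z_i}\right) = \psi_i \log\frac{k_{ib}^2}{k_{ib}^2 - z_i^2} + \hat{\alpha}_i^2 + \frac{\partial \hat{J}_i^*}{\partial z_i}\left(\hat{\alpha}_i + \phi_i + d_i - \dot{\hat{\alpha}}_{i-1}\right), \tag{73}$$
with $\hat{\alpha}_i$ and $\partial \hat{J}_i^* / \partial z_i$ given by (70) and (68). The disturbance observer is devised as
$$\hat{d}_i = \ell_{fi}(x_i - \kappa_i), \qquad \dot{\kappa}_i = x_{i+1} + \hat{d}_i + \hat{\vartheta}_{fi}^T S_{fi}. \tag{74}$$
Construct the Lyapunov function as
$$V_i = V_{i-1} + \frac{1}{2}\log\frac{k_{ib}^2}{k_{ib}^2 - z_i^2} + \frac{1}{2}\tilde{\vartheta}_{ci}^T\tilde{\vartheta}_{ci} + \frac{1}{2}\tilde{\vartheta}_{ai}^T\tilde{\vartheta}_{ai} + \frac{1}{2}\tilde{\vartheta}_{fi}^T\tilde{\vartheta}_{fi} + \frac{1}{2}\tilde{d}_i^2, \tag{75}$$
where $\tilde{\vartheta}_{ai} = \vartheta_{Ji}^* - \hat{\vartheta}_{ai}$, $\tilde{\vartheta}_{ci} = \vartheta_{Ji}^* - \hat{\vartheta}_{ci}$ and $\tilde{\vartheta}_{fi} = \vartheta_{fi}^* - \hat{\vartheta}_{fi}$.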
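By combining (13), (69) and (71), the time derivative of (75) is
$$\begin{aligned}\dot{V}_i ={}& \dot{V}_{i-1} + \frac{z_i}{k_{ib}^2 - z_i^2}\dot{z}_i - \tilde{\vartheta}_{ci}^T\dot{\hat{\vartheta}}_{ci} - \tilde{\vartheta}_{ai}^T\dot{\hat{\vartheta}}_{ai} - \tilde{\vartheta}_{fi}^T\dot{\hat{\vartheta}}_{fi} + \tilde{d}_i\dot{\tilde{d}}_i\\ ={}& \dot{V}_{i-1} + \frac{z_i}{k_{ib}^2 - z_i^2}\left(z_{i+1} - \frac{\sigma_i z_i^{2\delta-1}}{(k_{ib}^2 - z_i^2)^{\delta-1}} - \bar{\sigma}_i z_i - \hat{d}_i - \frac{1}{2}\hat{\vartheta}_{ai}^T S_{Ji} - \hat{\vartheta}_{fi}^T S_{fi} - \frac{9}{4}\frac{z_i}{k_{ib}^2 - z_i^2} - \dot{\hat{\alpha}}_{i-1} + \phi_i + d_i\right)\\ &+ \tilde{\vartheta}_{ai}^T S_{Ji} S_{Ji}^T\left(\gamma_{ai}(\hat{\vartheta}_{ai} - \hat{\vartheta}_{ci}) + \gamma_{ci}\hat{\vartheta}_{ci}\right) - \tilde{\vartheta}_{fi}^T\left(\frac{z_i}{k_{ib}^2 - z_i^2} S_{fi} - m_i\hat{\vartheta}_{fi}\right)\\ &+ \gamma_{ci}\tilde{\vartheta}_{ci}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ci} + \tilde{d}_i\left(\dot{d}_i - \ell_{fi}(\tilde{d}_i - \tilde{\vartheta}_{fi}^T S_{fi} - \varphi_{fi})\right).\end{aligned} \tag{76}$$
The following relations can be deduced by Young's inequality:
$$\frac{z_i}{k_{ib}^2 - z_i^2} z_{i+1} \le \frac{1}{2}\frac{z_i^2}{(k_{ib}^2 - z_i^2)^2} + \frac{1}{2}z_{i+1}^2, \tag{77}$$
$$\frac{z_i}{k_{ib}^2 - z_i^2}\varphi_{fi} \le \frac{1}{2}\frac{z_i^2}{(k_{ib}^2 - z_i^2)^2} + \frac{1}{2}\varphi_{fi}^2, \tag{78}$$
$$-\frac{z_i}{k_{ib}^2 - z_i^2}\dot{\hat{\alpha}}_{i-1} \le \frac{1}{2}\frac{z_i^2}{(k_{ib}^2 - z_i^2)^2} + \frac{1}{2}\dot{\hat{\alpha}}_{i-1}^2, \tag{79}$$
$$\frac{z_i}{k_{ib}^2 - z_i^2}\tilde{d}_i \le \frac{1}{2}\frac{z_i^2}{(k_{ib}^2 - z_i^2)^2} + \frac{1}{2}\tilde{d}_i^2, \tag{80}$$
$$-\frac{1}{2}\frac{z_i}{k_{ib}^2 - z_i^2}\hat{\vartheta}_{ai}^T S_{Ji} \le \frac{1}{4}\frac{z_i^2}{(k_{ib}^2 - z_i^2)^2} + \frac{1}{4}\hat{\vartheta}_{ai}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ai}. \tag{81}$$
Combining (77)–(81) with (76), we can derive
$$\begin{aligned}\dot{V}_i \le{}& \dot{V}_{i-1} - \frac{\sigma_i z_i^{2\delta}}{(k_{ib}^2 - z_i^2)^{\delta}} - \frac{\bar{\sigma}_i z_i^2}{k_{ib}^2 - z_i^2} + \frac{1}{2}\tilde{d}_i^2 + \frac{1}{2}z_{i+1}^2 + \tilde{\vartheta}_{ai}^T S_{Ji} S_{Ji}^T\left(\gamma_{ai}(\hat{\vartheta}_{ai} - \hat{\vartheta}_{ci}) + \gamma_{ci}\hat{\vartheta}_{ci}\right)\\ &- \tilde{\vartheta}_{fi}^T\left(\frac{z_i}{k_{ib}^2 - z_i^2} S_{fi} - m_i\hat{\vartheta}_{fi}\right) + \gamma_{ci}\tilde{\vartheta}_{ci}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ci}\\ &+ \tilde{d}_i\left(\dot{d}_i - \ell_{fi}(\tilde{d}_i - \tilde{\vartheta}_{fi}^T S_{fi} - \varphi_{fi})\right) + \frac{1}{4}\hat{\vartheta}_{ai}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ai}.\end{aligned} \tag{82}$$
Due to $\tilde{\vartheta}_{ci} = \vartheta_{Ji}^* - \hat{\vartheta}_{ci}$, $\tilde{\vartheta}_{ai} = \vartheta_{Ji}^* - \hat{\vartheta}_{ai}$ and $\tilde{\vartheta}_{fi} = \vartheta_{fi}^* - \hat{\vartheta}_{fi}$, the following relations are inferred:
$$\gamma_{ci}\tilde{\vartheta}_{ci}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ci} = \frac{1}{2}\gamma_{ci}\left(\vartheta_{Ji}^{*T} S_{Ji} S_{Ji}^T \vartheta_{Ji}^* - \tilde{\vartheta}_{ci}^T S_{Ji} S_{Ji}^T \tilde{\vartheta}_{ci} - \hat{\vartheta}_{ci}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ci}\right), \tag{83}$$
$$\gamma_{ai}\tilde{\vartheta}_{ai}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ai} = \frac{1}{2}\gamma_{ai}\left(\vartheta_{Ji}^{*T} S_{Ji} S_{Ji}^T \vartheta_{Ji}^* - \tilde{\vartheta}_{ai}^T S_{Ji} S_{Ji}^T \tilde{\vartheta}_{ai} - \hat{\vartheta}_{ai}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ai}\right), \tag{84}$$
$$(\gamma_{ci} - \gamma_{ai})\tilde{\vartheta}_{ai}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ci} \le \frac{\gamma_{ci} - \gamma_{ai}}{2}\left(\tilde{\vartheta}_{ai}^T S_{Ji} S_{Ji}^T \tilde{\vartheta}_{ai} - \hat{\vartheta}_{ci}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ci}\right), \tag{85, 86}$$
$$m_i\tilde{\vartheta}_{fi}^T\hat{\vartheta}_{fi} \le \frac{m_i}{2}\|\vartheta_{fi}^*\|^2 - \frac{m_i}{2}\|\tilde{\vartheta}_{fi}\|^2. \tag{87}$$
Substituting (83)–(87) into (82), we get
$$\begin{aligned}\dot{V}_i \le{}& \dot{V}_{i-1} - \frac{\sigma_i z_i^{2\delta}}{(k_{ib}^2 - z_i^2)^{\delta}} - \frac{\bar{\sigma}_i z_i^2}{k_{ib}^2 - z_i^2} - \frac{\gamma_{ci}}{2}\tilde{\vartheta}_{ci}^T S_{Ji} S_{Ji}^T \tilde{\vartheta}_{ci} - \frac{2\gamma_{ai} - \gamma_{ci}}{2}\tilde{\vartheta}_{ai}^T S_{Ji} S_{Ji}^T \tilde{\vartheta}_{ai} - \frac{m_i}{2}\|\tilde{\vartheta}_{fi}\|^2 + M_{i,1}\\ &- \frac{1}{2}\gamma_{ci}\hat{\vartheta}_{ci}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ci} - \left(\frac{1}{2}\gamma_{ai} - \frac{1}{4}\right)\hat{\vartheta}_{ai}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ai} - \frac{\gamma_{ci} - \gamma_{ai}}{2}\hat{\vartheta}_{ci}^T S_{Ji} S_{Ji}^T \hat{\vartheta}_{ci}\\ &+ \tilde{d}_i\left(\dot{d}_i - \ell_{fi}(\tilde{d}_i - \tilde{\vartheta}_{fi}^T S_{fi} - \varphi_{fi})\right),\end{aligned} \tag{88}$$
where $M_{i,1} = \frac{\gamma_{ci} + \gamma_{ai}}{2}\left(\vartheta_{Ji}^{*T} S_{Ji} S_{Ji}^T \vartheta_{Ji}^*\right) + \frac{1}{2}\dot{\hat{\alpha}}_{i-1}^2 + \frac{m_i}{2}\|\vartheta_{fi}^*\|^2 + \frac{1}{2}\varphi_{fi}^2 + \frac{1}{2}k_{(i+1)b}^2$, by virtue of $\frac{1}{2}z_{i+1}^2 < \frac{1}{2}k_{(i+1)b}^2$.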
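Let $\lambda_{\min}^{S_{Ji}}$ denote the minimal eigenvalue of $S_{Ji} S_{Ji}^T$. The following inequalities hold:
$$\tilde{d}_i\dot{d}_i \le \frac{1}{2}\tilde{d}_i^2 + \frac{1}{2}\underline{d}_i^2, \tag{89}$$
$$\ell_{fi}\tilde{d}_i\varphi_{fi} \le \frac{1}{2}\ell_{fi}\tilde{d}_i^2 + \frac{1}{2}\ell_{fi}\varphi_{fi}^2, \tag{90}$$
$$\ell_{fi}\tilde{d}_i\tilde{\vartheta}_{fi}^T S_{fi} \le \frac{1}{2}\ell_{fi}\tilde{d}_i^2 + \frac{1}{2}\ell_{fi}\lambda_{\min}^{S_{fi}}\|\tilde{\vartheta}_{fi}\|^2, \tag{91}$$
$$-\frac{\gamma_{ci}}{2}\tilde{\vartheta}_{ci}^T S_{Ji} S_{Ji}^T \tilde{\vartheta}_{ci} \le -\frac{\gamma_{ci}}{2}\lambda_{\min}^{S_{Ji}}\tilde{\vartheta}_{ci}^T\tilde{\vartheta}_{ci}, \tag{92}$$
$$-\frac{2\gamma_{ai} - \gamma_{ci}}{2}\tilde{\vartheta}_{ai}^T S_{Ji} S_{Ji}^T \tilde{\vartheta}_{ai} \le -\frac{2\gamma_{ai} - \gamma_{ci}}{2}\lambda_{\min}^{S_{Ji}}\tilde{\vartheta}_{ai}^T\tilde{\vartheta}_{ai}. \tag{93}$$
Substituting (89)–(93) into (88), we obtain
$$\begin{aligned}\dot{V}_i \le{}& -\sum_{j=1}^{i}\frac{\sigma_j z_j^{2\delta}}{(k_{jb}^2 - z_j^2)^{\delta}} - \sum_{j=1}^{i}\frac{\bar{\sigma}_j z_j^2}{k_{jb}^2 - z_j^2} + M_i - \sum_{j=1}^{i}\frac{2\gamma_{aj} - \gamma_{cj}}{2}\lambda_{\min}^{S_{Jj}}\|\tilde{\vartheta}_{aj}\|^2 - \sum_{j=1}^{i}\frac{\gamma_{cj}}{2}\lambda_{\min}^{S_{Jj}}\|\tilde{\vartheta}_{cj}\|^2\\ &- \sum_{j=1}^{i}\frac{1}{2}\left(m_j + \ell_{fj}\lambda_{\min}^{S_{fj}}\right)\|\tilde{\vartheta}_{fj}\|^2 - \sum_{j=1}^{i}\frac{1}{2}\left(\ell_{fj} - \frac{3}{2}\right)\tilde{d}_j^2,\end{aligned} \tag{94}$$
where $M_i = M_{i,1} + \frac{1}{2}\underline{d}_i^2 + \frac{1}{2}\ell_{fi}\varphi_{fi}^2$.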

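Step $n$: From (12) and (13), $z_n$ is given by
$$z_n = x_n - \hat{\alpha}_{n-1}. \tag{95}$$
The time derivative of (95) is
$$\dot{z}_n = \dot{x}_n - \dot{\hat{\alpha}}_{n-1} = u + \phi_n + d_n - \dot{\hat{\alpha}}_{n-1}. \tag{96}$$
Define the integral performance index function as
$$J_n(z_n) = \int_t^{\infty} h_n(z_n, u(z_n))\,ds, \tag{97}$$
where $h_n = \psi_n \log\frac{k_{nb}^2}{k_{nb}^2 - z_n^2} + u^2$. Let $u^*$ be the optimal actual controller; then the optimal performance index function is represented by
$$J_n^*(z_n) = \int_t^{\infty} h_n(z_n, u^*(z_n))\,ds = \min_{u}\left(\int_t^{\infty} h_n(z_n, u(z_n))\,ds\right). \tag{98}$$
Akin to (59), we have
$$\frac{dJ_n^*}{dt} = \frac{\partial J_n^*}{\partial z_n}\left(u^* + \phi_n + d_n - \dot{\hat{\alpha}}_{n-1}\right). \tag{99}$$
In association with (95), the HJB equation is formalized as
$$H_n\left(z_n, u^*, \frac{\partial J_n^*}{\partial z_n}\right) = \psi_n \log\frac{k_{nb}^2}{k_{nb}^2 - z_n^2} + u^{*2} + \frac{\partial J_n^*}{\partial z_n}\left(u^* + \phi_n + d_n - \dot{\hat{\alpha}}_{n-1}\right) = 0. \tag{100}$$
Similar to (61), solving $\partial H_n / \partial u^* = 0$ yields
$$u^* = -\frac{1}{2}\frac{\partial J_n^*}{\partial z_n}. \tag{101}$$
Let $\frac{\partial J_n^*}{\partial z_n} = \frac{2\sigma_n z_n^{2\delta-1}}{(k_{nb}^2 - z_n^2)^{\delta-1}} + 2\bar{\sigma}_n z_n + 2\hat{d}_n + \frac{7}{2}\frac{z_n}{k_{nb}^2 - z_n^2} + 2\phi_n + J_n^o(z_n)$, where $\sigma_n > 0$. The optimal actual controller can be stated as
$$u^* = -\frac{\sigma_n z_n^{2\delta-1}}{(k_{nb}^2 - z_n^2)^{\delta-1}} - \bar{\sigma}_n z_n - \hat{d}_n - \frac{7}{4}\frac{z_n}{k_{nb}^2 - z_n^2} - \phi_n - \frac{1}{2}J_n^o(z_n). \tag{102}$$
$J_n^o$ and $\phi_n$ can be approximated by fuzzy logic systems as
$$J_n^o = \vartheta_{Jn}^{*T} S_{Jn} + \varphi_{Jn}, \tag{103}$$
$$\phi_n = \vartheta_{fn}^{*T} S_{fn} + \varphi_{fn}. \tag{104}$$
Merging (103), (104) and (102), we obtain
$$u^* = -\frac{\sigma_n z_n^{2\delta-1}}{(k_{nb}^2 - z_n^2)^{\delta-1}} - \bar{\sigma}_n z_n - \hat{d}_n - \frac{1}{2}\vartheta_{Jn}^{*T} S_{Jn} - \frac{7}{4}\frac{z_n}{k_{nb}^2 - z_n^2} - \vartheta_{fn}^{*T} S_{fn} - \frac{1}{2}\varphi_n, \tag{105}$$
where $\varphi_n = 2\varphi_{fn} + \varphi_{Jn}$. Combining with (103), $\partial J_n^* / \partial z_n$ can be formulated as
$$\frac{\partial J_n^*}{\partial z_n} = \frac{2\sigma_n z_n^{2\delta-1}}{(k_{nb}^2 - z_n^2)^{\delta-1}} + 2\bar{\sigma}_n z_n + 2\hat{d}_n + \frac{7}{2}\frac{z_n}{k_{nb}^2 - z_n^2} + \vartheta_{Jn}^{*T} S_{Jn} + 2\vartheta_{fn}^{*T} S_{fn} + \varphi_n. \tag{106}$$
The critic for evaluating (106) and the critic update law are designed as
$$\frac{\partial \hat{J}_n^*}{\partial z_n} = \frac{2\sigma_n z_n^{2\delta-1}}{(k_{nb}^2 - z_n^2)^{\delta-1}} + 2\bar{\sigma}_n z_n + 2\hat{d}_n + \frac{7}{2}\frac{z_n}{k_{nb}^2 - z_n^2} + \hat{\vartheta}_{cn}^T S_{Jn} + 2\hat{\vartheta}_{fn}^T S_{fn}, \tag{107}$$
$$\dot{\hat{\vartheta}}_{cn} = -\gamma_{cn} S_{Jn} S_{Jn}^T \hat{\vartheta}_{cn}. \tag{108}$$
The actual controller and the actor update law are constructed as
$$\hat{u} = -\frac{\sigma_n z_n^{2\delta-1}}{(k_{nb}^2 - z_n^2)^{\delta-1}} - \bar{\sigma}_n z_n - \hat{d}_n - \frac{7}{4}\frac{z_n}{k_{nb}^2 - z_n^2} - \frac{1}{2}\hat{\vartheta}_{an}^T S_{Jn} - \hat{\vartheta}_{fn}^T S_{fn}, \tag{109}$$
$$\dot{\hat{\vartheta}}_{an} = -S_{Jn} S_{Jn}^T\left[\gamma_{an}(\hat{\vartheta}_{an} - \hat{\vartheta}_{cn}) + \gamma_{cn}\hat{\vartheta}_{cn}\right]. \tag{110}$$
The adaptive law $\dot{\hat{\vartheta}}_{fn}$ is
$$\dot{\hat{\vartheta}}_{fn} = \frac{z_n}{k_{nb}^2 - z_n^2} S_{fn} - m_n\hat{\vartheta}_{fn}, \tag{111}$$
where $m_n$ is a positive constant.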
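The HJB equation is derived as
$$H_n\left(z_n, \hat{u}, \frac{\partial \hat{J}_n^*}{\partial z_n}\right) = \psi_n \log\frac{k_{nb}^2}{k_{nb}^2 - z_n^2} + \hat{u}^2 + \frac{\partial \hat{J}_n^*}{\partial z_n}\left(\hat{u} + \phi_n + d_n - \dot{\hat{\alpha}}_{n-1}\right), \tag{112}$$
with $\hat{u}$ and $\partial \hat{J}_n^* / \partial z_n$ given by (109) and (107). The disturbance observer is built as
$$\hat{d}_n = \ell_{fn}(x_n - \kappa_n), \qquad \dot{\kappa}_n = u + \hat{d}_n + \hat{\vartheta}_{fn}^T S_{fn}. \tag{113}$$
The Lyapunov function is selected as follows:
$$V_n = V_{n-1} + \frac{1}{2}\log\frac{k_{nb}^2}{k_{nb}^2 - z_n^2} + \frac{1}{2}\tilde{\vartheta}_{cn}^T\tilde{\vartheta}_{cn} + \frac{1}{2}\tilde{\vartheta}_{an}^T\tilde{\vartheta}_{an} + \frac{1}{2}\tilde{\vartheta}_{fn}^T\tilde{\vartheta}_{fn} + \frac{1}{2}\tilde{d}_n^2. \tag{114}$$
Using (108), (110), (113) and (95), the time derivative of (114) is
$$\begin{aligned}\dot{V}_n ={}& \dot{V}_{n-1} + \frac{z_n}{k_{nb}^2 - z_n^2}\dot{z}_n - \tilde{\vartheta}_{cn}^T\dot{\hat{\vartheta}}_{cn} - \tilde{\vartheta}_{an}^T\dot{\hat{\vartheta}}_{an} - \tilde{\vartheta}_{fn}^T\dot{\hat{\vartheta}}_{fn} + \tilde{d}_n\dot{\tilde{d}}_n\\ ={}& \dot{V}_{n-1} + \frac{z_n}{k_{nb}^2 - z_n^2}\left(-\frac{\sigma_n z_n^{2\delta-1}}{(k_{nb}^2 - z_n^2)^{\delta-1}} - \bar{\sigma}_n z_n - \hat{d}_n - \frac{1}{2}\hat{\vartheta}_{an}^T S_{Jn} - \hat{\vartheta}_{fn}^T S_{fn} - \frac{7}{4}\frac{z_n}{k_{nb}^2 - z_n^2} - \dot{\hat{\alpha}}_{n-1} + \phi_n + d_n\right)\\ &+ \tilde{\vartheta}_{an}^T S_{Jn} S_{Jn}^T\left(\gamma_{an}(\hat{\vartheta}_{an} - \hat{\vartheta}_{cn}) + \gamma_{cn}\hat{\vartheta}_{cn}\right) - \tilde{\vartheta}_{fn}^T\left(\frac{z_n}{k_{nb}^2 - z_n^2} S_{fn} - m_n\hat{\vartheta}_{fn}\right)\\ &+ \gamma_{cn}\tilde{\vartheta}_{cn}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{cn} + \tilde{d}_n\left(\dot{d}_n - \ell_{fn}(\tilde{d}_n - \tilde{\vartheta}_{fn}^T S_{fn} - \varphi_{fn})\right).\end{aligned} \tag{115}$$
By Young's inequality, we acquire
$$\frac{z_n}{k_{nb}^2 - z_n^2}\varphi_{fn} \le \frac{1}{2}\frac{z_n^2}{(k_{nb}^2 - z_n^2)^2} + \frac{1}{2}\varphi_{fn}^2, \tag{116}$$
$$-\frac{z_n}{k_{nb}^2 - z_n^2}\dot{\hat{\alpha}}_{n-1} \le \frac{1}{2}\frac{z_n^2}{(k_{nb}^2 - z_n^2)^2} + \frac{1}{2}\dot{\hat{\alpha}}_{n-1}^2, \tag{117}$$
$$\frac{z_n}{k_{nb}^2 - z_n^2}\tilde{d}_n \le \frac{1}{2}\frac{z_n^2}{(k_{nb}^2 - z_n^2)^2} + \frac{1}{2}\tilde{d}_n^2, \tag{118}$$
$$-\frac{1}{2}\frac{z_n}{k_{nb}^2 - z_n^2}\hat{\vartheta}_{an}^T S_{Jn} \le \frac{1}{4}\frac{z_n^2}{(k_{nb}^2 - z_n^2)^2} + \frac{1}{4}\hat{\vartheta}_{an}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{an}. \tag{119}$$
Substituting (116)–(119) into (115), we have
$$\begin{aligned}\dot{V}_n \le{}& \dot{V}_{n-1} - \frac{\sigma_n z_n^{2\delta}}{(k_{nb}^2 - z_n^2)^{\delta}} - \frac{\bar{\sigma}_n z_n^2}{k_{nb}^2 - z_n^2} + \frac{1}{2}\tilde{d}_n^2 + \tilde{\vartheta}_{an}^T S_{Jn} S_{Jn}^T\left(\gamma_{an}(\hat{\vartheta}_{an} - \hat{\vartheta}_{cn}) + \gamma_{cn}\hat{\vartheta}_{cn}\right)\\ &- \tilde{\vartheta}_{fn}^T\left(\frac{z_n}{k_{nb}^2 - z_n^2} S_{fn} - m_n\hat{\vartheta}_{fn}\right) + \gamma_{cn}\tilde{\vartheta}_{cn}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{cn}\\ &+ \tilde{d}_n\left(\dot{d}_n - \ell_{fn}(\tilde{d}_n - \tilde{\vartheta}_{fn}^T S_{fn} - \varphi_{fn})\right) + \frac{1}{4}\hat{\vartheta}_{an}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{an}.\end{aligned} \tag{120}$$
From $\tilde{\vartheta}_{cn} = \vartheta_{Jn}^* - \hat{\vartheta}_{cn}$, $\tilde{\vartheta}_{an} = \vartheta_{Jn}^* - \hat{\vartheta}_{an}$ and $\tilde{\vartheta}_{fn} = \vartheta_{fn}^* - \hat{\vartheta}_{fn}$, we get the following relations:
$$\gamma_{cn}\tilde{\vartheta}_{cn}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{cn} = \frac{1}{2}\gamma_{cn}\left[\vartheta_{Jn}^{*T} S_{Jn} S_{Jn}^T \vartheta_{Jn}^* - \tilde{\vartheta}_{cn}^T S_{Jn} S_{Jn}^T \tilde{\vartheta}_{cn} - \hat{\vartheta}_{cn}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{cn}\right], \tag{121}$$
$$\gamma_{an}\tilde{\vartheta}_{an}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{an} = \frac{1}{2}\gamma_{an}\left[\vartheta_{Jn}^{*T} S_{Jn} S_{Jn}^T \vartheta_{Jn}^* - \tilde{\vartheta}_{an}^T S_{Jn} S_{Jn}^T \tilde{\vartheta}_{an} - \hat{\vartheta}_{an}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{an}\right], \tag{122}$$
$$(\gamma_{cn} - \gamma_{an})\tilde{\vartheta}_{an}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{cn} \le \frac{\gamma_{cn} - \gamma_{an}}{2}\left[\tilde{\vartheta}_{an}^T S_{Jn} S_{Jn}^T \tilde{\vartheta}_{an} - \hat{\vartheta}_{cn}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{cn}\right], \tag{123}$$
$$m_n\tilde{\vartheta}_{fn}^T\hat{\vartheta}_{fn} \le \frac{m_n}{2}\|\vartheta_{fn}^*\|^2 - \frac{m_n}{2}\|\tilde{\vartheta}_{fn}\|^2. \tag{124}$$
Invoking (121)–(124) and (94) in (120), we infer
$$\begin{aligned}\dot{V}_n \le{}& \dot{V}_{n-1} - \frac{\sigma_n z_n^{2\delta}}{(k_{nb}^2 - z_n^2)^{\delta}} - \frac{\bar{\sigma}_n z_n^2}{k_{nb}^2 - z_n^2} - \frac{2\gamma_{an} - \gamma_{cn}}{2}\tilde{\vartheta}_{an}^T S_{Jn} S_{Jn}^T \tilde{\vartheta}_{an} - \frac{m_n}{2}\|\tilde{\vartheta}_{fn}\|^2 + M_{n,1}\\ &- \frac{1}{2}\gamma_{cn}\hat{\vartheta}_{cn}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{cn} - \left(\frac{1}{2}\gamma_{an} - \frac{1}{4}\right)\hat{\vartheta}_{an}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{an} - \frac{\gamma_{cn} - \gamma_{an}}{2}\hat{\vartheta}_{cn}^T S_{Jn} S_{Jn}^T \hat{\vartheta}_{cn}\\ &- \frac{\gamma_{cn}}{2}\tilde{\vartheta}_{cn}^T S_{Jn} S_{Jn}^T \tilde{\vartheta}_{cn} + \tilde{d}_n\left(\dot{d}_n - \ell_{fn}(\tilde{d}_n - \tilde{\vartheta}_{fn}^T S_{fn} - \varphi_{fn})\right),\end{aligned} \tag{125}$$
where $M_{n,1} = \frac{\gamma_{cn} + \gamma_{an}}{2}\left(\vartheta_{Jn}^{*T} S_{Jn} S_{Jn}^T \vartheta_{Jn}^*\right) + \frac{1}{2}\dot{\hat{\alpha}}_{n-1}^2 + \frac{m_n}{2}\|\vartheta_{fn}^*\|^2 + \frac{1}{2}\varphi_{fn}^2$.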
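Let $\lambda_{\min}^{S_{Jn}}$ denote the minimal eigenvalue of $S_{Jn} S_{Jn}^T$. The following inequalities hold:
$$\tilde{d}_n\dot{d}_n \le \frac{1}{2}\tilde{d}_n^2 + \frac{1}{2}\underline{d}_n^2, \tag{126}$$
$$\ell_{fn}\tilde{d}_n\varphi_{fn} \le \frac{1}{2}\ell_{fn}\tilde{d}_n^2 + \frac{1}{2}\ell_{fn}\varphi_{fn}^2, \tag{127}$$
$$\ell_{fn}\tilde{d}_n\tilde{\vartheta}_{fn}^T S_{fn} \le \frac{1}{2}\ell_{fn}\tilde{d}_n^2 + \frac{1}{2}\ell_{fn}\lambda_{\min}^{S_{fn}}\|\tilde{\vartheta}_{fn}\|^2, \tag{128}$$
$$-\frac{\gamma_{cn}}{2}\tilde{\vartheta}_{cn}^T S_{Jn} S_{Jn}^T \tilde{\vartheta}_{cn} \le -\frac{\gamma_{cn}}{2}\lambda_{\min}^{S_{Jn}}\tilde{\vartheta}_{cn}^T\tilde{\vartheta}_{cn}, \tag{129}$$
$$-\frac{2\gamma_{an} - \gamma_{cn}}{2}\tilde{\vartheta}_{an}^T S_{Jn} S_{Jn}^T \tilde{\vartheta}_{an} \le -\frac{2\gamma_{an} - \gamma_{cn}}{2}\lambda_{\min}^{S_{Jn}}\tilde{\vartheta}_{an}^T\tilde{\vartheta}_{an}. \tag{130}$$
Substituting (126)–(130) into (125) and proceeding as for (94), we obtain the following inequality:
$$\begin{aligned}\dot{V}_n \le{}& -\sum_{j=1}^{n}\frac{\sigma_j z_j^{2\delta}}{(k_{jb}^2 - z_j^2)^{\delta}} - \sum_{j=1}^{n}\frac{\bar{\sigma}_j z_j^2}{k_{jb}^2 - z_j^2} + M_n - \sum_{j=1}^{n}\frac{2\gamma_{aj} - \gamma_{cj}}{2}\lambda_{\min}^{S_{Jj}}\|\tilde{\vartheta}_{aj}\|^2 - \sum_{j=1}^{n}\frac{\gamma_{cj}}{2}\lambda_{\min}^{S_{Jj}}\|\tilde{\vartheta}_{cj}\|^2\\ &- \sum_{j=1}^{n}\frac{1}{2}\left(m_j + \ell_{fj}\lambda_{\min}^{S_{fj}}\right)\|\tilde{\vartheta}_{fj}\|^2 - \sum_{j=1}^{n}\frac{1}{2}\left(\ell_{fj} - \frac{3}{2}\right)\tilde{d}_j^2,\end{aligned} \tag{131}$$
where $M_n = M_{n,1} + \frac{1}{2}\underline{d}_n^2 + \frac{1}{2}\ell_{fn}\varphi_{fn}^2$.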

4. Stability analysis

In this section, the stability of the system is demonstrated.

Theorem 4.1

Consider the nonlinear system (1) with actuator bias fault signal and external disturbances. Suppose that Assumptions 2.1 and 2.2 hold. With the critic adaptive laws (28), (69) and (108), the actor update laws (30), (71) and (110), the fuzzy adaptive laws (26), (72) and (111), and the disturbance observers (36), (74) and (113), the proposed fuzzy adaptive optimal finite-time control scheme ensures that (1) all signals within the closed-loop system are bounded; (2) all states remain in their specific intervals.

Proof.

Let V=Vn, on the basis of (Equation131), we have (132) V˙j=1nσjzj2δ(kjb2zj2)δj=1nσ¯jzj2kjb2zj2+Mnj=1n2γajγcj2λminSJj~aj2j=1nγcj2λminSJj~cj2j=1n12(mj+fj)λminSfj~fj2j=1n12(fj32)d~j2.(132) Let Cσ=min{2σ1,2σ2,,2σn}, Cσ¯=min{2σ¯1,2σ¯2,,2σ¯n}, Ca=min{(2γa1γc1)λminSJ1,(2γa2γc2)λminSJ2,,(2γanγcn)λminSJn}, Cc=min{γc1λminSJ1,γc2λminSJ2,,γcnλminSJn}, Cf=min{(m1+f1)λminSf1,(m2+f2)λminSf2,(mn+fn)λminSfn} and Cd=min{f132,f232,,fn32}. From (Equation132), we acquire (133) V˙Cσ12j=1nzj2δ(kjb2zj2)δCσ¯12j=1nzj2kjb2zj2+MnCa12j=1n~aj2Cc12j=1n~cj2Cf12j=1n~fj2Cd12j=1nd~j2.(133) In light of (Equation4), define p1=1,p2=k=1n12~kT~k, r1=1δ, r2=δ, r3=1, one has (134) (j=1n12~fj2)δ1δ+δj=1n12~fj2.(134) In view of the same fashion as (Equation134), it leads to (135) (j=1n12d~j2)δ1δ+δ12d~j2,(135) (136) (j=1n12~aj2)δ1δ+δj=1n12~aj2,(136) (137) (j=1n12~cj2)δ1δ+j=1n12~cj2.(137) It follows from (Equation5) that (138) 12Cσk=1nzj2δ(kjb2zj2)δ2δ1Cσ(k=1n12zj2kjb2zj2)δ.(138) According to (Equation135)–(Equation138), (Equation133) can be rewritten as (139) V˙2δ1Cσ(k=1n12zj2kjb2zj2)δCσ¯12j=1nzj2kjb2zj2Cf(j=1n12~fj2)δCa(j=1n12~aj2)δCc(j=1n12~cj2)δCd(j=1n12~cj2)δCf(1δ)j=1n12~fj2Cc(1δ)j=1n12~cj2Ca(1δ)j=1n12~aj2+(1δ)(Cc+Cd+Cf+Ca)+Mn.(139) Define C1=min{2δ1Cσ,Cf,Ca,Cc,Cd,} and C2=min{Cσ¯,Cf(1δ),Cc(1δ),,Ca(1δ)}. It is worth noting that zi (i=1,2,,n) in constraint sets, we can get j=1n12logkjb2kjb2zj2<j=1n12zj2kjb2zj2, rewrite (Equation139) as (140) V˙C1VδC2V+M,(140) where M=Mn+(1δ)(Cc+Cd+Cf+Ca). In light of (Equation140), we know that V˙C2V+M, therefore, it is easy to obtain V is bounded. The similarity follows that ||~ai||, ||~ci|| and ||~fi|| are bounded. Therefore, ||ˆa1||, ||ˆc1|| and ||ˆf1|| are bounded. From previous analysis, we get |zi|kib, it can be further deducted by Assumption 2.2 that |x1|=|z1|+|yr|k1b+yr=k1c, where yr is the upper bounded of yr. αˆ1 is also bounded and αˆ1α¯1, and we can derive that |x2|=|z2|+αˆ1k2b+α¯1=k2c. In the same way, it is true that xikic, where i=3,,n. 
Thus all system states are confined within constraint sets.
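The two elementary inequalities used in the proof above can be spot-checked numerically. The sketch below is only an illustration and not part of the original derivation: the helper names `young_gap` and `log_gap` and the sampling grids are assumptions. It samples $p\ge0$ and $|z|<k_b$ and checks that $p^{\delta}\le1-\delta+\delta p$ (the form of (134)–(137)) and that $\log\frac{k_b^2}{k_b^2-z^2}\le\frac{z^2}{k_b^2-z^2}$ (the step used before (140)).

```python
import math

def young_gap(p: float, delta: float) -> float:
    """Return (1 - delta + delta*p) - p**delta, expected to be >= 0 for p >= 0."""
    return (1.0 - delta) + delta * p - p ** delta

def log_gap(z: float, kb: float) -> float:
    """Return z^2/(kb^2 - z^2) - log(kb^2/(kb^2 - z^2)), expected >= 0 for |z| < kb."""
    d = kb * kb - z * z
    return z * z / d - math.log(kb * kb / d)

delta = 999.0 / 1001.0   # value of delta quoted in the simulation section
kb = 1.0                 # k_1b quoted in Example 5.1

# Sampling grids are arbitrary choices made for this illustration.
p_grid = [0.05 * i for i in range(1001)]                 # p in [0, 50]
z_grid = [kb * (-0.99 + 0.01 * i) for i in range(199)]   # z in [-0.99*kb, 0.99*kb]

min_young = min(young_gap(p, delta) for p in p_grid)
min_log = min(log_gap(z, kb) for z in z_grid)

# A small tolerance absorbs floating-point rounding near the equality cases.
print(min_young >= -1e-12, min_log >= -1e-12)
```

Both gaps vanish only at the equality cases ($p=1$ and $z=0$), which is why a tolerance rather than a strict comparison is used.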

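Setting

$$T=\frac{1}{C_2(1-\delta)}\ln\frac{C_2V^{1-\delta}(x_0)+lC_1}{C_2\Big(\frac{M}{(1-l)C_1}\Big)^{\frac{1-\delta}{\delta}}+lC_1},$$

where $0<l<1$, we can deduce that $V(x)\le\Big(\frac{M}{(1-l)C_1}\Big)^{\frac{1-\delta}{\delta}}$, thus

$$\frac{1}{2}\log\frac{k_{1b}^2}{k_{1b}^2-z_1^2}\le\frac{1}{2}\frac{z_1^2}{k_{1b}^2-z_1^2}\le V\le\Big(\frac{M}{(1-l)C_1}\Big)^{\frac{1-\delta}{\delta}}.\tag{141}$$

It follows from (141) that

$$|z_1|=|y-y_r|\le k_{1b}\Big[1-e^{-2\big(\frac{M}{(1-l)C_1}\big)^{\frac{1-\delta}{\delta}}}\Big]^{\frac{1}{2}},\tag{142}$$

which means that the tracking error remains within a small neighbourhood of the origin after the settling time.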
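To see how the design constants enter the settling time $T$ and the tracking-error bound (142), the expressions can be evaluated directly. The following is a minimal sketch that transcribes the two formulas as stated in the text; the function names and all numerical values ($C_1$, $C_2$, $M$, $l$, $\delta$, $V(x_0)$) are illustrative assumptions and are not taken from the paper.

```python
import math

def residual_level(M, C1, l, delta):
    """(M / ((1 - l) * C1)) ** ((1 - delta) / delta), the level appearing in T and (142)."""
    return (M / ((1.0 - l) * C1)) ** ((1.0 - delta) / delta)

def settling_time(V0, M, C1, C2, l, delta):
    """T = ln[(C2*V0^(1-delta) + l*C1) / (C2*B + l*C1)] / (C2*(1-delta))."""
    B = residual_level(M, C1, l, delta)
    num = C2 * V0 ** (1.0 - delta) + l * C1
    den = C2 * B + l * C1
    return math.log(num / den) / (C2 * (1.0 - delta))

def tracking_error_bound(k1b, M, C1, l, delta):
    """Right-hand side of (142): k1b * sqrt(1 - exp(-2*B))."""
    B = residual_level(M, C1, l, delta)
    return k1b * math.sqrt(1.0 - math.exp(-2.0 * B))

# Hypothetical constants chosen only to exercise the formulas.
V0, M, C1, C2, l, delta, k1b = 4.0, 0.1, 2.0, 1.0, 0.5, 0.6, 1.0

T = settling_time(V0, M, C1, C2, l, delta)
bound = tracking_error_bound(k1b, M, C1, l, delta)
bound_larger_C1 = tracking_error_bound(k1b, M, 2.0 * C1, l, delta)
print(T, bound, bound_larger_C1)
```

With these assumed values the bound stays below $k_{1b}$ and shrinks when $C_1$ is doubled, which matches the qualitative dependence on $C_1$ described in the text.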

Remark 4.1

The adaptive optimal control schemes described in [26, 28, 32] ensure that all signals are semi-globally uniformly ultimately bounded with $\dot V\le-C_1V+M$, so the convergence time may be infinite. Under the strategy proposed in this paper, the time derivative of $V$ satisfies $\dot V\le-C_1V^{\delta}-C_2V+M$, which means that a faster response can be achieved.
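The difference between the two bounds can be illustrated by integrating the corresponding comparison equations (the inequalities taken as equalities). The sketch below is a rough forward-Euler illustration under assumed constants, not a result from the paper; `time_to_reach` is a hypothetical helper, and the linear case is obtained by switching off the fractional-power term while keeping the same linear coefficient.

```python
def time_to_reach(level, v0, c1, c2, m, delta, dt=1e-3, t_max=50.0):
    """Integrate dV/dt = -c1*V**delta - c2*V + m by forward Euler and
    return the first time V <= level, or None if not reached before t_max."""
    v, t = v0, 0.0
    while t < t_max:
        if v <= level:
            return t
        # max(v, 0.0) guards the fractional power against a tiny negative overshoot.
        v += dt * (-c1 * max(v, 0.0) ** delta - c2 * v + m)
        t += dt
    return None

# Illustrative constants only (assumptions, not values from the paper).
V0, C2, M, DELTA, LEVEL = 4.0, 1.0, 0.01, 0.5, 0.05

t_linear = time_to_reach(LEVEL, V0, 0.0, C2, M, DELTA)   # dV/dt = -C2*V + M
t_finite = time_to_reach(LEVEL, V0, 2.0, C2, M, DELTA)   # additional -C1*V**delta term
print(f"linear bound: {t_linear:.2f} s, with fractional-power term: {t_finite:.2f} s")
```

For these assumed constants the trajectory with the fractional-power term reaches the chosen level earlier than the purely linear one.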

Remark 4.2

It is worth noting that, by choosing appropriate parameters, the tracking error remains within a neighbourhood of the origin after the settling time. This neighbourhood can be reduced by adjusting the value of $\delta$ or by increasing $C_1$. Therefore, the parameters should be chosen carefully.

5. Simulation example

This section verifies the effectiveness of the proposed fuzzy control method with two simulation examples.

Example 5.1

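A model with external disturbances and a bias fault signal is given by

$$\begin{cases}\dot x_1=x_2+\phi_1(x_1)+D_1(t),\\ \dot x_2=u+u_f+\phi_2(\bar x_2)+D_2(t),\\ y(t)=x_1,\end{cases}\tag{143}$$

where $\bar x_2=[x_1,x_2]^T$, $\phi_1=\sin(x_1)\cos(x_1)$, $\phi_2=\sin(x_1+x_2)$, $D_1=2\sin(0.5t)$, $D_2=\sin(2t)$, the output reference trajectory is defined as $y_r=5\sin(0.5t)$, and

$$u_f=\begin{cases}0,&t<T,\\ 6\sin(0.9t-1),&t\ge T,\end{cases}\tag{144}$$

where $T=10$ s. The membership functions are chosen as

$$\begin{aligned}\mu_{G_1}(x_1)&=\exp\Big\{-\frac{(x_1-3)^2}{8}\Big\},&\mu_{G_2}(x_1)&=\exp\Big\{-\frac{(x_1-1)^2}{8}\Big\},&\mu_{G_3}(x_1)&=\exp\Big\{-\frac{(x_1)^2}{8}\Big\},\\ \mu_{G_4}(x_1)&=\exp\Big\{-\frac{(x_1+1)^2}{8}\Big\},&\mu_{G_5}(x_1)&=\exp\Big\{-\frac{(x_1+3)^2}{8}\Big\}.\end{aligned}\tag{145}$$

Thus the basis function vector $S_{f1}$ is denoted as

$$S_{f1}=\Big[\frac{\mu_{G_1}(x_1)}{\sum_{i=1}^{5}\mu_{G_i}(x_1)},\ldots,\frac{\mu_{G_5}(x_1)}{\sum_{i=1}^{5}\mu_{G_i}(x_1)}\Big]^T.\tag{146}$$

Analogously, the basis function vectors $S_{f2}$, $S_{J1}$ and $S_{J2}$ are formulated as

$$S_{f2}=\Big[\frac{\prod_{j=1}^{2}\mu_{G_1}(x_j)}{\sum_{i=1}^{5}\prod_{j=1}^{2}\mu_{G_i}(x_j)},\ldots,\frac{\prod_{j=1}^{2}\mu_{G_5}(x_j)}{\sum_{i=1}^{5}\prod_{j=1}^{2}\mu_{G_i}(x_j)}\Big]^T,$$

$$S_{J1}=\Big[\frac{\mu_{G_1}(z_1)}{\sum_{i=1}^{5}\mu_{G_i}(z_1)},\ldots,\frac{\mu_{G_5}(z_1)}{\sum_{i=1}^{5}\mu_{G_i}(z_1)}\Big]^T,\qquad S_{J2}=\Big[\frac{\mu_{G_1}(z_2)}{\sum_{i=1}^{5}\mu_{G_i}(z_2)},\ldots,\frac{\mu_{G_5}(z_2)}{\sum_{i=1}^{5}\mu_{G_i}(z_2)}\Big]^T.$$

The initial values are configured as $\hat{\aleph}_{f1}(0)=\hat{\aleph}_{f2}(0)=[0.8,0.8,0.8,0.8,0.8]^T$, $\hat{\aleph}_{c1}(0)=[0.3,0.3,0.3,0.3,0.3]^T$, $\hat{\aleph}_{c2}(0)=[0.8,0.8,0.8,0.8,0.8]^T$, $\hat{\aleph}_{a1}(0)=[0.1,0.1,0.1,0.1,0.1]^T$, $\hat{\aleph}_{a2}(0)=[0.5,0.5,0.5,0.5,0.5]^T$, and $x_1(0)=x_2(0)=0$.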
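The fuzzy basis function vectors can be computed in a few lines. The sketch below assumes five Gaussian membership functions with centres 3, 1, 0, −1, −3 and the factor 8 in the exponent, as in (145), and assumes that the multi-input vector is formed as a product of memberships per rule followed by normalisation. The names `membership` and `basis_vector` are hypothetical and introduced only for illustration.

```python
import math
from typing import List

CENTRES = (3.0, 1.0, 0.0, -1.0, -3.0)   # centres of G1..G5 as in (145)
WIDTH = 8.0                              # denominator inside the exponent

def membership(x: float, centre: float) -> float:
    """Gaussian membership value exp(-(x - centre)^2 / 8)."""
    return math.exp(-((x - centre) ** 2) / WIDTH)

def basis_vector(*states: float) -> List[float]:
    """Normalised firing strengths: for each rule, multiply the memberships
    of all supplied state components, then divide by the sum over the rules."""
    strengths = [math.prod(membership(x, c) for x in states) for c in CENTRES]
    total = sum(strengths)
    return [s / total for s in strengths]

S_f1 = basis_vector(0.0)         # single-input vector evaluated at x1 = 0
S_f2 = basis_vector(1.0, 1.0)    # two-input vector evaluated at x1 = x2 = 1
print(S_f1)
print(S_f2)
```

Because of the normalisation, each vector sums to one, and the largest entry belongs to the rule whose centre is closest to the current state.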

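The parameters in the adaptive laws, the optimal virtual controller and the optimal actual controller are designed as $m_1=m_2=2$, $\gamma_{c1}=\gamma_{c2}=15$, $\gamma_{a1}=\gamma_{a2}=13$, $k_{1c}=k_{2c}=5.5$, $k_{1b}=1$, $k_{2b}=2$, $\sigma_1=\sigma_2=30$, $\bar\sigma_1=\bar\sigma_2=10$ and $\delta=\frac{999}{1001}$.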

Figure 1 shows the trajectories of $y$, $y_r$ and the constraint bound $k_{1c}$, which illustrates the control performance and shows that the system state stays within the restricted interval. Figure 2 presents the trajectories of the tracking error $z_1$ and $k_{1b}$, showing that the tracking error is maintained in a small neighbourhood of zero. Figure 3 shows the trajectories of $x_2$ and $k_{2c}$; the state $x_2$ remains within the blue dashed line. Figures 4–6 display the curves of $\|\hat{\aleph}_{ci}\|$, $\|\hat{\aleph}_{ai}\|$ and $\|\hat{\aleph}_{fi}\|$, where $i=1,2$, and the curves in these figures are decreasing. Figure 7 shows the trajectory of the optimal actual controller $u$. The simulation results indicate that the proposed scheme achieves the desired control objective.

Figure 1. The trajectories of $y$, $y_r$ and $k_{1c}$.

Figure 2. The trajectories of $z_1$ and $k_{1b}$.

Figure 3. The trajectories of $x_2$ and $k_{2c}$.

Figure 4. The curves of $\hat{\aleph}_{c1}$ and $\hat{\aleph}_{c2}$.

Figure 5. The curves of $\hat{\aleph}_{a1}$ and $\hat{\aleph}_{a2}$.

Figure 6. The curves of $\hat{\aleph}_{f1}$ and $\hat{\aleph}_{f2}$.

Figure 7. The trajectory of $u$.

Example 5.2

Similar to [9] and [48], this paper considers the robotic manipulator system

$$J\ddot S+A\dot S+MGr\sin(S)=u(t),\tag{147}$$

where $S$ and $\dot S$ are the angle and angular velocity of the link, respectively, $M$ is the total mass of the link, $J$ is the rotational inertia of the motor, $G$ is the gravitational acceleration and $A$ is the damping coefficient. Taking the effect of the external disturbance and the bias fault on the system into account, (147) is rewritten as

$$J\ddot S+A\dot S+MGr\sin(S)=u_A(t)+D(t).\tag{148}$$

The parameters in (148) are selected as in [48], that is, $J=1$, $A=2$, $M=1$, $G=10$, $r=1$. Let $x_1=S$ and $x_2=\dot S$; then the system (148) can be rewritten as

$$\begin{cases}\dot x_1=x_2,\\ \dot x_2=u_A-10\sin(x_1)-2x_2+D(t),\end{cases}\tag{149}$$

where $f_2(\bar x_2)=-10\sin(x_1)-2x_2$ and $u_A=u(t)+u_f$. The fuzzy logic system and the bias fault signal are the same as in Example 5.1. The parameters used in the control strategy are designed as $m_1=m_2=2$, $\gamma_{c1}=6$, $\gamma_{a1}=4$, $\gamma_{c2}=15$, $\gamma_{a2}=13$, $k_{1c}=k_{2c}=5.5$, $k_{1b}=1$, $k_{2b}=2$, $\sigma_1=\sigma_2=30$, $\bar\sigma_1=\bar\sigma_2=10$ and $\delta=\frac{999}{1001}$. The reference trajectory $y_r$ and the external disturbance $D(t)$ are defined as $y_r=0.2\sin(t)$ and $D(t)=0.3\sin(2t)$. The initial values are configured as $\hat{\aleph}_{f1}(0)=\hat{\aleph}_{f2}(0)=[0.8,0.8,0.8,0.8,0.8]^T$, $\hat{\aleph}_{c1}(0)=[0.2,0.1,0.1,0,0.1]^T$, $\hat{\aleph}_{c2}(0)=[0.1,0,0,0.2,0.1]^T$, $\hat{\aleph}_{a1}(0)=[0.1,0.2,0.1,0,0]^T$, $\hat{\aleph}_{a2}(0)=[0.1,0,0,0.2,0.1]^T$, and $x_1(0)=x_2(0)=0$.
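The manipulator model can be written as a state-derivative function for use in a numerical integrator. The sketch below assumes the state choice $x_1=S$, $x_2=\dot S$, the parameter values quoted in this example, the bias fault of Example 5.1 and the disturbance $D(t)=0.3\sin(2t)$. The function names are hypothetical, and the control input `u` is left as an argument because the controller itself is not reproduced here.

```python
import math

J, A, M, G, R = 1.0, 2.0, 1.0, 10.0, 1.0   # parameter values quoted in Example 5.2
T_FAULT = 10.0                              # fault onset time T from Example 5.1

def bias_fault(t: float) -> float:
    """u_f: zero before T, 6*sin(0.9*t - 1) afterwards."""
    return 0.0 if t < T_FAULT else 6.0 * math.sin(0.9 * t - 1.0)

def disturbance(t: float) -> float:
    """External disturbance D(t) = 0.3*sin(2t)."""
    return 0.3 * math.sin(2.0 * t)

def manipulator_rhs(t, x, u):
    """State derivatives of J*S'' + A*S' + M*G*r*sin(S) = u + u_f + D(t)
    with x = (S, S')."""
    x1, x2 = x
    u_a = u + bias_fault(t)                 # actual input including the bias fault
    dx1 = x2
    dx2 = (u_a + disturbance(t) - A * x2 - M * G * R * math.sin(x1)) / J
    return (dx1, dx2)

# Open-loop check with u = 0 at an arbitrary state (illustration only).
print(manipulator_rhs(0.0, (0.0, 1.0), 0.0))
```

A closed-loop simulation would pass the controller output as `u` at each integration step; any standard ODE scheme can then be applied to `manipulator_rhs`.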

Figures 8–14 show the simulation results of the robotic manipulator system. Figure 8 shows the trajectories of $y$, $y_r$ and $k_{1c}$, indicating satisfactory control performance. As shown in Figure 9, the tracking error $z_1$, represented by the red solid line, does not exceed the blue dashed line $k_{1b}$ and remains within a small neighbourhood of the origin. The trajectory of $x_2$, which does not cross the blue dashed line $k_{2c}$, is shown in Figure 10. Figures 11–13 display the curves of $\|\hat{\aleph}_{ci}\|$, $\|\hat{\aleph}_{ai}\|$ and $\|\hat{\aleph}_{fi}\|$, where $i=1,2$. Figure 14 shows the trajectory of the optimal actual controller $u$.

Figure 8. The trajectories of $y$, $y_r$ and $k_{1c}$.

Figure 9. The trajectories of $z_1$ and $k_{1b}$.

Figure 10. The trajectories of $x_2$ and $k_{2c}$.

Figure 11. The curves of $\hat{\aleph}_{c1}$ and $\hat{\aleph}_{c2}$.

Figure 12. The curves of $\hat{\aleph}_{a1}$ and $\hat{\aleph}_{a2}$.

Figure 13. The curves of $\hat{\aleph}_{f1}$ and $\hat{\aleph}_{f2}$.

Figure 14. The trajectory of $u$.

6. Conclusion

In this article, the issue of adaptive fuzzy optimal finite-time control for uncertain nonlinear systems with bias fault and external disturbances is studied. The bias fault term and the external disturbance are treated as a total disturbance, and a disturbance observer is designed to track this total disturbance online. By combining backstepping with the ADP technique, an adaptive fuzzy optimal finite-time control approach is proposed. It is proved that all signals of the closed-loop system are finite-time stable and that all system states remain within their constraint sets. One future research direction is to extend the proposed method to more general systems such as stochastic systems [49], switched nonlinear systems [50, 51] and uncertain under-actuated switched nonlinear systems [52]. Another research direction is to study unknown nonaffine nonlinear fault problems and unmodelled dynamics problems based on fixed-time control [53–55].

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgments

The authors thank the reviewers for their constructive comments in improving the quality of this paper. Zhidong Sun wrote the manuscript, Wei Gao and Li Liang revised and made improvements on the manuscript. The authors have worked equally when writing this paper. All authors read and approved the final manuscript.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

Some or all data, models, or code generated or used during the study are available from the corresponding author by request.

Additional information

Funding

This work was supported by the National Natural Science Foundation of China [12161094].

References

  • Ma ZY, Ma HJ. Adaptive fuzzy backstepping dynamic surface control of strict-feedback fractional-order uncertain nonlinear systems. IEEE Trans Fuzzy Syst. 2020;28:122–133.
  • Liang YJ, Li YX, Che WW, et al. Adaptive fuzzy asymptotic tracking for nonlinear systems with nonstrict-feedback structure. IEEE Trans Cybern. 2021;51:853–861.
  • Ma H, Liang HJ, Zhou Q, et al. Adaptive dynamic surface control design for uncertain nonlinear strict-feedback systems with unknown control direction and disturbances. IEEE Trans Syst Man Cybern Syst. 2019;49:506–515.
  • Wang HQ, Liu XP, Liu KF, et al. Approximation-based adaptive fuzzy tracking control for a class of nonstrict-feedback stochastic nonlinear time-delay systems. IEEE Trans Fuzzy Syst. 2015;23:1746–1760.
  • Lai GY, Liu Z, Chen CLP, et al. Adaptive compensation for infinite number of time-varying actuator failures in fuzzy tracking control of uncertain nonlinear systems. IEEE Trans Fuzzy Syst. 2018;26:474–486.
  • Zou AM, Hou ZG, Tan M. Adaptive control of a class of nonlinear pure-feedback systems using fuzzy backstepping approach. IEEE Trans Fuzzy Syst. 2008;16:886–897.
  • Zhang HC, Song AG, Li HJ, et al. Novel adaptive finite-time control of teleoperation system with time-varying delays and input saturation. IEEE Trans Cybern. 2021;51:3724–3737.
  • Bhat SP, Bernstein DS. Continuous finite-time stabilization of the translational and rotational double integrators. IEEE Trans Automat Contr. 1998;43:678–682.
  • Qiu JB, Wang T, Sun KK, et al. Disturbance observer-based adaptive fuzzy control for strict-feedback nonlinear systems with finite-time prescribed performance. IEEE Trans Fuzzy Syst. 2022;30:1175–1184.
  • Li Y, Qu F, Tong S. Observer-based fuzzy adaptive finite-time containment control of nonlinear multiagent systems with input delay. IEEE Trans Cybern. 2021;51:126–137.
  • Zhang HG, Liu Y, Wang YC. Observer-based finite-time adaptive fuzzy control for nontriangular nonlinear systems with full-state constraints. IEEE Trans Cybern. 2021;51:1110–1120.
  • Meng B, Liu WH, Qi XJ. Disturbance and state observer-based adaptive finite-time control for quantized nonlinear systems with unknown control directions. J Franklin Inst. 2022;359:2906–2931.
  • Sui S, Chen CLP, Tong SC. Event-trigger-based finite-time fuzzy adaptive control for stochastic nonlinear system with unmodeled dynamics. IEEE Trans Fuzzy Syst. 2021;29:1914–1926.
  • Saravanakumar R, Stojanovic SB, Radosavljevic DD, et al. Finite-time passivity-based stability criteria for delayed discrete-time neural networks via new weighted summation inequalities. IEEE Trans Neural Netw Learn Syst. 2019;30:58–71.
  • Xia JW, Zhang J, Sun W, et al. Finite-time adaptive fuzzy control for nonlinear systems with full state constraints. IEEE Trans Syst Man Cybern Syst. 2019;49:1541–1548.
  • Wang YD, Zong GD, Yang D, et al. Finite-time adaptive tracking control for a class of nonstrict feedback nonlinear systems with full state constraints. Int J Robust Nonlinear Control. 2022;32:2551–2569.
  • Zhao L, Liu GQ, Yu JP. Finite-time adaptive fuzzy tracking control for a class of nonlinear systems with full-state constraints. IEEE Trans Fuzzy Syst. 2021;29:2246–2255.
  • Nguyen VT, Lin CY, Su SF, Sun W. Finite-time adaptive fuzzy tracking control design for parallel manipulators with unbounded uncertainties. Int J Fuzzy Syst. 2019;21:545–555.
  • Wang LB, Wang HQ, Liu XP. Adaptive fuzzy finite-time control of stochastic nonlinear systems with actuator faults. Nonlinear Dyn. 2021;104:523–536.
  • Bellman RE. Dynamic programming. Princeton (NJ): Princeton University Press; 1957.
  • Werbos PJ. Approximate dynamic programming for real-time control and neural modeling. Handb Intell Control Neural Fuzzy & Adapt Approaches. 1992;15:493–525.
  • Abu-Khalaf M, Lewis FL. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica. 2005;41(5):779–791.
  • Vamvoudakis KG, Lewis FL. Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica. 2010;46(5):878–888.
  • Wen GX, Chen CLP, Feng J, et al. Optimized multi-agent formation control based on an identifier–actor–critic reinforcement learning algorithm. IEEE Trans Fuzzy Syst. 2018;26:2719–2731.
  • Li KW, Li YM. Fuzzy adaptive optimal consensus fault-tolerant control for stochastic nonlinear multiagent systems. IEEE Trans Fuzzy Syst. 2022;30:2870–2885.
  • Wen GX, Li B, Niu B. Optimized backstepping control using reinforcement learning of observer-critic-actor architecture based on fuzzy system for a class of nonlinear strict-feedback systems. IEEE Trans Fuzzy Syst. 2022;30:4322–4335.
  • Lan J, Liu YJ, Yu DX, et al. Time-varying optimal formation control for second-order multiagent systems based on neural network observer and reinforcement learning. IEEE Trans Neural Netw Learn Syst. 2022. DOI:10.1109/TNNLS.2022.3158085
  • Li YM, Liu YJ, Tong SC. Observer-based neuro-adaptive optimized control of strict-feedback nonlinear systems with state constraints. IEEE Trans Neural Netw Learn Syst. 2022;33:3131–3145.
  • Bhasin S, Kamalapurkar R, Johnson M, et al. A novel actor–critic-identifier architecture for approximate optimal control of uncertain nonlinear systems. Automatica. 2013;49:82–92.
  • Vamvoudakis KG, Miranda MF, Hespanha JP. Asymptotically stable adaptive-optimal control algorithm with saturating actuators and relaxed persistence of excitation. IEEE Trans Neural Netw Learn Syst. 2016;27:2386–2398.
  • Zargarzadeh H, Dierks T, Jagannathan S. Optimal control of nonlinear continuous-time systems in strict-feedback form. IEEE Trans Neural Netw Learn Syst. 2015;26:2535–2549.
  • Wen GX, Chen CLP, Ge SS. Simplified optimized backstepping control for a class of nonlinear strict-feedback systems with unknown dynamic functions. IEEE Trans Cybern. 2021;51:4567–4580.
  • Zerari N, Chemachema M. Event-triggered adaptive output-feedback neural-networks control for saturated strict-feedback nonlinear systems in the presence of external disturbance. Nonlinear Dyn. 2021;104:1343–1362.
  • Li HY, Zhao SY, He W, et al. Adaptive finite-time tracking control of full state constrained nonlinear systems with dead-zone. Automatica. 2019;100:99–107.
  • Liu YC, Zhu QD, Wen GX. Adaptive tracking control for perturbed strict-feedback nonlinear systems based on optimized backstepping technique. IEEE Trans Neural Netw Learn Syst. 2022;33:853–865.
  • Sariyildiz E, Ohnishi K. On the explicit robust force control via disturbance observer. IEEE Trans Ind Electron. 2015;62:1581–1589.
  • Tong SC, Li YM, Liu YJ. Observer-based adaptive neural networks control for large-scale interconnected systems with nonconstant control gains. IEEE Trans Neural Netw Learn Syst. 2021;32:1575–1585.
  • Sun HB, Guo L. Neural network-based DOBC for a class of nonlinear systems with unmatched disturbances. IEEE Trans Neural Netw Learn Syst. 2017;28:482–489.
  • Ji WY, Pan YN, Zhao M. Adaptive fault-tolerant optimized formation control for perturbed nonlinear multiagent systems. Int J Robust Nonlinear Control. 2022;32:3386–3407.
  • Liu M, Ho DWC, Shi P. Adaptive fault-tolerant compensation control for Markovian jump systems with mismatched external disturbance. Automatica. 2015;58:5–14.
  • Xu B, Shou YX, Luo J, et al. Neural learning control of strict-feedback systems using disturbance observer. IEEE Trans Neural Netw Learn Syst. 2018;30:1296–1307.
  • Song RZ, Lewis FL. Robust optimal control for a class of nonlinear systems with unknown disturbances based on disturbance observer and policy iteration. Neurocomputing. 2020;390:185–195.
  • Zerari N, Chemachema M. Robust adaptive neural network prescribed performance control for uncertain CSTR system with input nonlinearities and external disturbance. Neural Comput Appl. 2020;32:10541–10554.
  • Ran MP, Li JC, Xie LH. Reinforcement-learning-based disturbance rejection control for uncertain nonlinear systems. IEEE Trans Cybern. 2022;52:9621–9633.
  • Chen M, Ge SS. Adaptive neural output feedback control of uncertain nonlinear systems with unknown hysteresis using disturbance observer. IEEE Trans Ind Electron. 2015;62:7706–7716.
  • Li KW, Li YM. Adaptive NN optimal consensus fault-tolerant control for stochastic nonlinear multiagent systems. IEEE Trans Neural Netw Learn Syst. 2023;34(2):947–957. DOI:10.1109/TNNLS.2021.3104839
  • Li HY, Wu Y, Chen M. Adaptive fault-tolerant tracking control for discrete-time multiagent systems via reinforcement learning algorithm. IEEE Trans Cybern. 2021;51:1163–1174.
  • Xing LT, Wen CY, Liu ZT, et al. Adaptive compensation for actuator failures with event-triggered input. Automatica. 2017;85:129–136.
  • Li YL, Liu B, Zong GD, et al. Command filter-based adaptive neural finite-time control for stochastic nonlinear systems with time-varying full state constraints and asymmetric input saturation. Int J Syst Sci. 2022;53:199–221.
  • Zhang HY, Wang HQ, Niu B, et al. Sliding-mode surface-based adaptive actor-critic optimal control for switched nonlinear systems with average dwell time. Inf Sci. 2021;580:756–774.
  • Wang HQ, Tong M, Zhao XD, et al. Predefined-time adaptive neural tracking control of switched nonlinear systems. IEEE Trans Cybern. 2022. DOI:10.1109/TCYB.2022.3204275
  • Zhang HY, Zhao XD, Zhang L, et al. Observer-based adaptive fuzzy hierarchical sliding mode control of uncertain under-actuated switched nonlinear systems with input quantization. Int J Robust Nonlinear Control. 2022;32:8163–8185.
  • Li YM, Sun KK, Tong SC. Observer-based adaptive fuzzy fault-tolerant optimal control for SISO nonlinear systems. IEEE Trans Cybern. 2019;49:649–661.
  • Wang HQ, Xu K, Zhang HG. Adaptive finite-time tracking control of nonlinear systems with dynamics uncertainties. IEEE Trans Automat Contr. 2022. DOI:10.1109/TAC.2022.3226703
  • Ma JW, Wang HQ, Qiao JF. Adaptive neural fixed-time tracking control for high-order nonlinear systems. IEEE Trans Neural Netw Learn Syst. 2022. DOI:10.1109/TNNLS.2022.3176625