
High-order fully actuated system approaches: Part VIII. Optimal control with application in spacecraft attitude stabilisation

Pages 54-73 | Received 24 Jan 2021, Accepted 28 May 2021, Published online: 19 Jun 2021

Abstract

In this paper, the optimal control problem for dynamical systems represented by general high-order fully actuated (HOFA) models is formulated. The problem aims to minimise an objective in the quadratic form of the states and their derivatives of certain orders. The designed controller is a combination of the linearising nonlinear controller and an optimal quadratic controller for a converted linear system. In the infinite-time output regulation case, the solution is in essence a nonlinear state feedback dependent on a well-known Riccati algebraic equation. In the sub-fully actuated system case, the feasibility of the controller is investigated and guaranteed by properly characterising a ball restriction area of the system initial values. Application of the optimal control technique for sub-fully actuated systems to spacecraft attitude control provides very smooth and steady responses and clearly demonstrates the effectiveness and simplicity of the proposed approach.

This article is part of the following collections:
High-order fully actuated (HOFA) system approaches

1. Introduction

Optimal control was born in the late 1950s with the appearance of the linear quadratic regulator (LQR), dynamic programming and the maximum principle. Kalman (1960) first introduced the LQR for linear systems. The method determines an optimal control in an analytic feedback form and plays a fundamental role in linear control systems theory (Anderson & Moore, 1990). Afterwards, fruitful results appeared in the literature. Constrained LQR has been proposed for linear systems subject to constraints on the inputs and states (Chmielewski & Manousiouthakis, 1996; Scokaert & Rawlings, 1998); stochastic systems (Chen et al., 1998) and descriptor systems (Bender & Laub, 1987a, 1987b) have also been studied, with corresponding LQRs developed. Terra et al. (2014) consider robust linear quadratic regulators (RLQRs) for discrete-time linear systems with parametric uncertainties, while noise in the state and control is taken into account in Rami et al. (2002). Also, Agrawal (2006) and Li and Chen (2008) extend the optimal control theory to fractional dynamic systems and give the solution to fractional LQR problems. In addition, Li and Todorov (2004) present an iterative LQR design for nonlinear biological movement systems. In the context of switched and hybrid systems, a number of important properties associated with the discrete-time switched LQR (DSLQR) problem have been derived (Zhang et al., 2009, 2012).

Dynamic programming was first proposed by Bellman (1952) with the intention of solving multistage decision process problems. This approach transforms the discrete-time optimal control problem into solving the Hamilton–Jacobi–Bellman (HJB) equation (Bellman & Kalaba, 1957, 1960). For problems involving conflicting objectives, major developments in multi-objective dynamic optimisation have been made, e.g. the fuzzy dynamic programming proposed in Abo-Sinna (2004). To break the 'curse of dimensionality' associated with the traditional dynamic programming approach, approximate dynamic programming has been developed by combining simulation and function approximation (Wang et al., 2009), where heuristic dynamic programming (HDP) (Werbos, 1977), action-dependent HDP (Werbos, 1989), neuro-dynamic programming (Bertsekas & Tsitsiklis, 1995) and learning-based algorithms (Lewis & Vrabie, 2009; Tsitsiklis & Roy, 1997) can be used.

Right around the time Bellman proposed dynamic programming, Pontryagin's maximum principle appeared for general continuous-time nonlinear control problems, obtained by finding the solution of the boundary conditions of the HJB equation (Pontryagin et al., 1962; Rozonoer, 1959). Later, a discrete version was presented in Hwang and Fan (1967). Nowadays, these types of optimal control techniques have been widely generalised, with extensions to infinite-dimensional state spaces (Seierstad, 1975), hybrid control systems with the hybrid maximum principle (Dmitruk & Kaganovich, 2008; Sussmann, 1999), constrained impulsive control problems with necessary conditions in the form of Pontryagin's maximum principle (Arutyunov et al., 2012), Boolean control networks (Laschov & Margaliot, 2011a, 2011b) and optimal control problems with time delays (Bokov, 2011).

As a type of optimal control, model predictive control (MPC) techniques provide a methodology to tackle model uncertainty and dynamical constraints on states and inputs without expert intervention (Findeisen et al., 2003; Garcia et al., 1989). The idea originally comes from dynamic matrix control (DMC), and the current century has witnessed an explosion of activity in MPC. For deterministic systems, hybrid MPC, economic MPC, explicit MPC and distributed MPC have attracted much attention. On the other hand, robust MPC and stochastic MPC have been proposed to deal with the control of uncertain systems subject to disturbances, or state and control constraints. More details can be found in Lee (2011) and Mayne (2014) and the references therein.

Besides the above, H∞ optimal control is definitely a celebrated breakthrough, which was developed to deal with the worst-case control design for systems subject to input disturbances (Ball & Helton, 1990; Doyle et al., 1988). The case where parameter uncertainty appears in the plant model motivates the research on the robust H∞ control problem, with results for both continuous and discrete systems (Xie & de Souza, 1990; Xu et al., 2000). The robust H∞ control theory has also been extended to time-delay systems (Xu & Chen, 2002). In addition, as an effective approximation method, state-dependent Riccati equation (SDRE)-based methods address nonlinear optimal control problems and provide systematic and effective design of feedback controllers (Çimen, 2008, 2012), where constrained SDRE for systems with state constraints, and SDRE in combination with H∞ control, sliding mode control, neural networks, etc., have also been studied (Nekoo, 2019).

1.1. Fully actuated system approaches

Most of the approaches mentioned above are set in the general state-space framework. Except for those optimal control techniques designed for linear systems, e.g. linear quadratic optimal control and linear H2/H∞ optimal control, most of the reported optimal control approaches are, more or less, complicated in theory and are hence difficult to apply in practice due to theoretical or practical obstacles, such as the need to solve complicated nonlinear differential or algebraic equations which are not solvable in an accurate sense.

As argued in Duan (2020a), Duan (2020b), Duan (2020d) and Duan (2020j), high-order fully actuated (HOFA) models also serve as a general model for dynamical control systems, and are especially convenient and effective in dealing with control problems. This has been demonstrated with robust control, adaptive control and disturbance rejection control in Duan (2020f), Duan (2020g), Duan (2020h) and Duan (2020i). A major advantage of HOFA models is that their full-actuation feature allows one to cancel the known nonlinearities in the system and hence to convert, to an extent, a nonlinear problem into a linear one. This advantage is again certified in this paper with the problem of nonlinear optimal control.

Any optimal control problem generally involves a performance index, and the most basic form of index is the quadratic one. In this paper, an optimal control problem for general dynamical systems described by HOFA models is formulated to minimise an index in the quadratic form of the state and its derivatives of certain orders. As a consequence of this requirement, the system responses eventually behave very smoothly and steadily.

As mentioned above, utilising the full-actuation feature of the HOFA models, the nonlinear optimal control problem is converted essentially to a linear quadratic optimal control problem, and a nonlinear optimal controller in a state feedback form is then obtained. Following such an outline, the finite-time optimal tracking control problem is first treated, and then the infinite-time output regulation problem is also solved.

Another contribution of the paper is the treatment of the sub-fully actuated system case. Sub-fully actuated systems are defined in Duan (2020a) and Duan (2020j), and are relatively more difficult to handle due to a feasibility problem. Using a feature of linear quadratic optimal control, a feasibility condition is established for the optimal control of sub-fully actuated systems. It turns out that feasibility is guaranteed when a condition on only the initial values of the system is met.

For demonstration of the proposed nonlinear optimal control approach, control of the attitude system of a spacecraft is considered. It is well known that the spacecraft attitude system is highly nonlinear when the attitude angles work over a wide range. Hence accurate attitude manoeuvring with smooth and steady transient performance has long remained a big challenge. With the new design, stabilisation of the attitude with very smooth and steady transient performance is achieved, and elimination of rapid turning is also efficiently realised. The design and simulation results fully demonstrate that the proposed approach provides a very effective and simple way to tackle such spacecraft attitude control problems.

In the sequel, $I_n$ denotes the identity matrix, $\emptyset$ denotes the null set, and $\Omega\setminus\Theta$ represents the complement of the set $\Theta$ in the set $\Omega$. For a square matrix $P$, $\lambda_{\max}(P)$, $\lambda_{\min}(P)$ and $\det(P)$ denote its maximum and minimum eigenvalues and its determinant, respectively, while for a nonsingular matrix $P$, its condition number is denoted by $\nu(P)=\|P\|\,\|P^{-1}\|$. Furthermore, for $x,x_i\in\mathbb{R}^m$, $A_i\in\mathbb{R}^{m\times m}$, and $n_0,n_i\in\mathbb{N}$, $n_0<n_i$, $i=1,2,\ldots,n$, as in the former papers of the series, $\operatorname{blockdiag}(A_i,\,i=1,2,\ldots,n)$ denotes a block diagonal matrix with the $i$-th diagonal block being $A_i$, and

$$\|x\|_P=\sqrt{x^TPx},\qquad \|x\|=\|x\|_I=\sqrt{x^Tx},$$

$$x^{(n_1\sim n_2)}=\begin{bmatrix}x^{(n_1)}\\ x^{(n_1+1)}\\ \vdots\\ x^{(n_2)}\end{bmatrix},\ n_1\le n_2,\qquad x_{i\sim j}^{(n_1\sim n_2)}=\begin{bmatrix}x_i^{(n_1\sim n_2)}\\ x_{i+1}^{(n_1\sim n_2)}\\ \vdots\\ x_j^{(n_1\sim n_2)}\end{bmatrix},\ i\le j,\ n_1\le n_2,$$

$$x_k^{(n_k)}\Big|_{k=i\sim j}=\begin{bmatrix}x_i^{(n_i)}\\ x_{i+1}^{(n_{i+1})}\\ \vdots\\ x_j^{(n_j)}\end{bmatrix},\ i\le j,\qquad x_k^{(n_0\sim n_k)}\Big|_{k=i\sim j}=\begin{bmatrix}x_i^{(n_0\sim n_i)}\\ x_{i+1}^{(n_0\sim n_{i+1})}\\ \vdots\\ x_j^{(n_0\sim n_j)}\end{bmatrix},\ i\le j,$$

$$A_{0\sim n-1}=\begin{bmatrix}A_0 & A_1 & \cdots & A_{n-1}\end{bmatrix},\qquad \Phi(A_{0\sim n-1})=\begin{bmatrix}0 & I & & \\ & & \ddots & \\ & & & I\\ -A_0 & -A_1 & \cdots & -A_{n-1}\end{bmatrix}.$$

The paper is organised into seven sections. The next section formulates the nonlinear optimal control problem to be solved in the paper, and a solution to the problem is then given in Section 3. In Sections 4 and 5, the cases of output regulation and sub-fully actuated systems are treated, respectively. An application of the proposed optimal control technique to a spacecraft attitude control problem is presented in Section 6, followed by a brief concluding remark in Section 7. The appendix gives the general attitude model of a spacecraft.

2. Problem formulation

2.1. The HOFA model

Consider the following general HOFA system with multiple orders proposed by Duan (2020j):

$$\begin{bmatrix}x_1^{(\mu_1)}\\ x_2^{(\mu_2)}\\ \vdots\\ x_\eta^{(\mu_\eta)}\end{bmatrix}=\begin{bmatrix}f_1(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\\ f_2(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\\ \vdots\\ f_\eta(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\end{bmatrix}+B(\cdot)\,u,\tag{1}$$

where $u\in\mathbb{R}^r$ is the control vector; $\zeta\in\mathbb{R}^p$ may represent a parameter vector, an external variable vector, a time-delayed state vector, an unmodelled dynamic state vector, etc.; $\mu_k$, $k=1,2,\ldots,\eta$, are a set of distinct integers; and $x_k\in\mathbb{R}^{r_k}$, $k=1,2,\ldots,\eta$, are a set of vectors of proper dimensions, with $r_k$, $k=1,2,\ldots,\eta$, being a set of integers satisfying

$$r_1+r_2+\cdots+r_\eta=r.\tag{2}$$

Further, $f_k(\cdot)\in\mathbb{R}^{r_k}$, $k=1,2,\ldots,\eta$, are a set of nonlinear vector functions, and $B(\cdot)=B(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\in\mathbb{R}^{r\times r}$ is a sufficiently smooth matrix function satisfying the following full-actuation condition:

Assumption A1

$\det B(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\neq 0$ or $\infty$, for all $x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta}$, $\zeta$, and $t>0$.

The system (1) satisfying the above Assumption A1 is called a (globally) fully actuated system (Duan, 2020j).

Remark 2.1

The above HOFA model (1) might easily be mistaken to represent only a very small class of systems due to the full-actuation Assumption A1. However, as discussed in Duan (2020a), Duan (2020d) and Duan (2020j), it serves as a general model for dynamical control systems. Many systems in state-space form can be converted into HOFA systems (see Duan (2020a), Duan (2020b), Duan (2020d) and Duan (2020e)), and practical systems can also be modelled directly as HOFA systems.

If we denote

$$f(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)=\begin{bmatrix}f_1(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\\ f_2(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\\ \vdots\\ f_\eta(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\end{bmatrix},$$

then the HOFA system (1) can be compactly written as

$$x_k^{(\mu_k)}\Big|_{k=1\sim\eta}=f(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)+B(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\,u.\tag{3}$$

Recalling that $x_k\in\mathbb{R}^{r_k}$, $k=1,2,\ldots,\eta$, we have $x_k^{(0\sim\mu_k-1)}\in\mathbb{R}^{\mu_kr_k}$, $k=1,2,\ldots,\eta$. Denote

$$\varkappa=\sum_{k=1}^{\eta}r_k\mu_k,\tag{4}$$

then it is easy to see that $x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta}\in\mathbb{R}^{\varkappa}$. For the above system (3), we can impose the output equation

$$y=C\,x_k^{(0\sim\mu_k-1)}\Big|_{k=1\sim\eta},\tag{5}$$

where $C\in\mathbb{R}^{m\times\varkappa}$ is a known matrix.

Particularly, in the case that the above system (3) has the following set of output equations

$$y_k=C_kx_k^{(0\sim\mu_k-1)},\quad k=1,2,\ldots,\eta,\tag{6}$$

where $C_k$, $k=1,2,\ldots,\eta$, are a set of constant matrices of appropriate dimensions, we can define

$$y=\begin{bmatrix}y_1\\ y_2\\ \vdots\\ y_\eta\end{bmatrix},\qquad C=\operatorname{blockdiag}(C_k,\,k=1,2,\ldots,\eta),$$

which obeys the output Equation (5).

2.2. Statement of the problem

The design objective is to let the output $y$ track a properly given, sufficiently differentiable signal $z(t)\in\mathbb{R}^m$. To realise this, we introduce the following objective:

$$J_1=\frac12\int_0^{t_f}\Big(\|y-z\|_Q^2+\big\|x_k^{(\mu_k)}|_{k=1\sim\eta}\big\|_R^2\Big)\,\mathrm{d}t+\frac12\|y(t_f)-z(t_f)\|_S^2,\tag{7}$$

where $Q,S\in\mathbb{R}^{m\times m}$ are two semi-positive definite matrices, while $R\in\mathbb{R}^{r\times r}$ is a positive definite one. Obviously, the term $\|x_k^{(\mu_k)}|_{k=1\sim\eta}\|_R^2$ in the integral of the above index $J_1$ aims to minimise the changing rate of the system states. As a consequence, smoothness and steadiness of the state vector $x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta}$ will be achieved.

Based on the above description, the problem to be solved in this paper can now be stated as follows.

Problem 2.1

Given the HOFA system (3) with the output Equation (5), two semi-positive definite matrices $Q,S\in\mathbb{R}^{m\times m}$, and a positive definite matrix $R\in\mathbb{R}^{r\times r}$, find a feedback controller in the following form:

$$u=B^{-1}(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\big[-f(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)+v\big],\tag{8}$$

such that the index $J_1$ given by (7) is minimised, where in (8)

$$v=v(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},z,\zeta,t)\tag{9}$$

is some function which is continuous with respect to its variables.

To end this section, let us finally make some remarks about the considered HOFA model (1), or, equivalently, (3).

Remark 2.2

Please note that a special case of the above general HOFA model (1) is clearly the following HOFA model with a single order:

$$x^{(n)}=f(x^{(0\sim n-1)},\zeta,t)+B(x^{(0\sim n-1)},\zeta,t)\,u,\tag{10}$$

which forms the basic part of the system models involved in the problems of robust control, adaptive control and disturbance rejection treated in Duan (2020f), Duan (2020g), Duan (2020h) and Duan (2020i).

Remark 2.3

Most physical systems are governed by certain physical laws, such as Newton's laws, the Lagrangian equations, the theorem of linear and angular momentum, and Kirchhoff's current and voltage laws. When such physical laws are used in modelling, a series of second-order subsystems is originally obtained (Duan, 2020a). From this stage we can further obtain a state-space model by variable extension and, on the other hand, a HOFA model by variable elimination if the system is controllable (Duan, 2020j). It should be noted that many physically fully actuated systems exist which are already in the form of second-order HOFA systems at the modelling stage.

3. Solution to the problem

The solution to Problem 2.1 can be derived in three steps.

3.1. Step I. Deriving the linear system

In this step, we choose for the system (3) a control input transformation in the form of (8), with

$$v=\begin{bmatrix}v_1\\ v_2\\ \vdots\\ v_\eta\end{bmatrix}$$

being an introduced control vector. Under this control transformation, the system (3) is turned into the following series of linear systems:

$$x_k^{(\mu_k)}=v_k,\quad k=1,2,\ldots,\eta,\tag{11}$$

which can be equivalently written in the state-space form as

$$\dot x_k^{(0\sim\mu_k-1)}=\Phi_k(0_{0\sim\mu_k-1})\,x_k^{(0\sim\mu_k-1)}+B_{kc}v_k,\quad k=1,2,\ldots,\eta,\tag{12}$$

where, by our notations,

$$\Phi_k(0_{0\sim\mu_k-1})=\begin{bmatrix}0 & I_{r_k} & & \\ & & \ddots & \\ & & & I_{r_k}\\ 0 & 0 & \cdots & 0\end{bmatrix},\qquad B_{kc}=\begin{bmatrix}0\\ \vdots\\ 0\\ I_{r_k}\end{bmatrix}.\tag{13a}$$

Define

$$A_E=\operatorname{blockdiag}(\Phi_k(0_{0\sim\mu_k-1}),\,k=1,2,\ldots,\eta),\qquad B_E=\operatorname{blockdiag}(B_{kc},\,k=1,2,\ldots,\eta),$$

then the set of systems in (12) can be more compactly written as

$$\dot x_k^{(0\sim\mu_k-1)}\Big|_{k=1\sim\eta}=A_E\,x_k^{(0\sim\mu_k-1)}\Big|_{k=1\sim\eta}+B_Ev.\tag{14}$$
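The construction of $A_E$ and $B_E$ is purely mechanical, so it can be sketched in a few lines of code. The following Python fragment is an illustrative sketch of ours (the function name and the use of NumPy/SciPy are our own choices, not from the paper); it builds the block matrices of (13a) and stacks them as in (14):

```python
import numpy as np
from scipy.linalg import block_diag

def build_chain_matrices(mu, r):
    """Build (A_E, B_E) of the converted linear system (14): one
    integrator chain of length mu[k], with block size r[k], per x_k."""
    blocks_A, blocks_B = [], []
    for m, rk in zip(mu, r):
        A = np.zeros((m * rk, m * rk))
        for i in range(m - 1):
            # super-diagonal identity blocks of Phi_k in (13a)
            A[i * rk:(i + 1) * rk, (i + 1) * rk:(i + 2) * rk] = np.eye(rk)
        B = np.zeros((m * rk, rk))
        B[(m - 1) * rk:, :] = np.eye(rk)   # B_kc: identity in the last block row
        blocks_A.append(A)
        blocks_B.append(B)
    return block_diag(*blocks_A), block_diag(*blocks_B)
```

For example, `mu = [2, 3]` and `r = [1, 2]` give an $A_E$ of size $8\times 8$ (here $\varkappa=1\cdot2+2\cdot3=8$) and a $B_E$ of size $8\times 3$, and the resulting pair $[A_E,B_E]$ is controllable, a fact used in Section 4.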

3.2. Step II. Index conversion

In this step, let us consider the index $J_1$ defined by (7).

In view of (3), we have

$$\big\|x_k^{(\mu_k)}|_{k=1\sim\eta}\big\|_R^2=\big\|f(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)+B(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\,u\big\|_R^2.\tag{15}$$

Further using (8), we obtain

$$v=f(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)+B(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\,u.\tag{16}$$

Substituting the above relation (16) into (15) gives

$$\big\|x_k^{(\mu_k)}|_{k=1\sim\eta}\big\|_R^2=\|v\|_R^2.\tag{17}$$

Finally, substituting (17) into (7) turns the index $J_1$ into the following form:

$$J_1=\frac12\int_0^{t_f}\big(\|y-z\|_Q^2+\|v\|_R^2\big)\,\mathrm{d}t+\frac12\|y(t_f)-z(t_f)\|_S^2.\tag{18}$$

3.3. Step III. Solving the linear optimal problem

In this step, we design a state feedback control law for the linear system (14), with the output Equation (5), to minimise the index $J_1$ defined in (18). According to the well-known optimal control result for linear systems (see, e.g. Theorem 8.5.1 in Duan (2016)), the controller is readily obtained as

$$v=-R^{-1}B_E^TP(t)\,x_k^{(0\sim\mu_k-1)}\Big|_{k=1\sim\eta}+R^{-1}B_E^Tg(t),\tag{19}$$

where $P(t)\in\mathbb{R}^{\varkappa\times\varkappa}$ is the solution to the following differential Riccati equation:

$$-\dot P=PA_E+A_E^TP-PB_ER^{-1}B_E^TP+C^TQC,\tag{20}$$

with the final value condition

$$P(t_f)=C^TSC,\tag{21}$$

and $g(t)\in\mathbb{R}^{\varkappa}$ is a vector function satisfying the differential equation

$$\dot g(t)=A_g(t)\,g(t)-C^TQz(t),\tag{22}$$

with

$$A_g(t)=P(t)B_ER^{-1}B_E^T-A_E^T,\tag{23}$$

and the final value condition

$$g(t_f)=C^TSz(t_f).\tag{24}$$

To sum up, we have the following theorem about the solution to Problem 2.1.
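Since (20)–(21) form a standard matrix differential equation with a terminal condition, $P(t)$ can be computed by integrating backwards in time from $t_f$. The sketch below is our own illustration (assuming NumPy/SciPy; the function name is hypothetical) of this backward integration:

```python
import numpy as np
from scipy.integrate import solve_ivp

def solve_riccati_ode(AE, BE, C, Q, R, S, tf, n_pts=50):
    """Integrate the differential Riccati equation (20),
    -dP/dt = P*A + A'*P - P*B*inv(R)*B'*P + C'*Q*C,
    backwards from the final condition P(tf) = C'*S*C of (21)."""
    n = AE.shape[0]
    Rinv = np.linalg.inv(R)
    def rhs(t, p):
        P = p.reshape(n, n)
        dP = -(P @ AE + AE.T @ P - P @ BE @ Rinv @ BE.T @ P + C.T @ Q @ C)
        return dP.ravel()
    Pf = (C.T @ S @ C).ravel()
    # a decreasing time span makes solve_ivp integrate backwards in time
    sol = solve_ivp(rhs, (tf, 0.0), Pf,
                    t_eval=np.linspace(tf, 0.0, n_pts), rtol=1e-8)
    return sol.t, sol.y.T.reshape(-1, n, n)
```

As a sanity check, for the scalar chain $A_E=0$, $B_E=C=Q=R=1$, $S=0$, the backward solution is $P(t)=\tanh(t_f-t)$, which approaches the algebraic-Riccati value $P=1$ for large $t_f$, consistent with Lemma 4.2 below.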

Theorem 3.1

Let Assumption A1 be met. Then a solution to Problem 2.1 is given by

$$\begin{cases}u=B^{-1}(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\big[-f(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)+v\big],\\[2pt] v=-R^{-1}B_E^TP(t)\,x_k^{(0\sim\mu_k-1)}\big|_{k=1\sim\eta}+R^{-1}B_E^Tg(t),\end{cases}\tag{25}$$

where $P(t)\in\mathbb{R}^{\varkappa\times\varkappa}$ is the solution to the differential Riccati Equation (20) subject to the final value condition (21), and $g(t)\in\mathbb{R}^{\varkappa}$ is a vector function satisfying the differential Equations (22)–(23) and the final value condition (24).

In the case of output regulation, that is, the case of $z(t)=0$, the index $J_1$ becomes

$$J_2=\frac12\int_0^{t_f}\Big(\|y\|_Q^2+\big\|x_k^{(\mu_k)}|_{k=1\sim\eta}\big\|_R^2\Big)\,\mathrm{d}t+\frac12\|y(t_f)\|_S^2.\tag{26}$$

In this special case, the function $g(t)$ correspondingly vanishes, and the above Theorem 3.1 obviously reduces to the following result.

Corollary 3.1

Let Assumption A1 be met, $Q,S\in\mathbb{R}^{m\times m}$ be semi-positive definite, and $R\in\mathbb{R}^{r\times r}$ be positive definite. Then, for the system (3) with the output Equation (5), a feedback controller in the form of (8), which minimises the index $J_2$, is given by

$$\begin{cases}u=B^{-1}(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\big[-f(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)+v\big],\\[2pt] v=-R^{-1}B_E^TP(t)\,x_k^{(0\sim\mu_k-1)}\big|_{k=1\sim\eta},\end{cases}\tag{27}$$

where $P(t)\in\mathbb{R}^{\varkappa\times\varkappa}$ is the solution to the differential Riccati Equation (20) subject to the final value condition (21).

To end this section, let us make the following remark.

Remark 3.1

In linear systems theory, time-varying linear systems form a very important part. Different from time-invariant ones, time-varying linear systems are usually much more difficult to handle. As a matter of fact, the well-known linear quadratic optimal control problem is generally formulated for time-varying linear systems. We point out that, with HOFA approaches, problems related to time-varying linear systems generally do not arise, since the time-varying terms of the system may be included in the nonlinear term of the HOFA model and hence cancelled by using the full-actuation property.

4. The infinite-time case

4.1. The problem

Let us look into the case of infinite-time output regulation, that is, the case of $z(t)=0$ and $t_f=\infty$. Now the final-time term in the objective $J_1$ given in (7) vanishes, and the objective turns into

$$J_3=\frac12\int_0^{\infty}\Big(\|y\|_Q^2+\big\|x_k^{(\mu_k)}|_{k=1\sim\eta}\big\|_R^2\Big)\,\mathrm{d}t,\tag{28}$$

where $Q\in\mathbb{R}^{m\times m}$ is a semi-positive definite matrix, while $R\in\mathbb{R}^{r\times r}$ is a positive definite matrix.

Particularly, when the problem of infinite-time state regulation is considered, that is, when $C=I_{\varkappa}$ or $y=x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta}$, the above index can simply be written as

$$J_4=\frac12\int_0^{\infty}\Big(x_k^{(0\sim\mu_k)}\big|_{k=1\sim\eta}\Big)^T\begin{bmatrix}Q&0\\0&R\end{bmatrix}x_k^{(0\sim\mu_k)}\Big|_{k=1\sim\eta}\,\mathrm{d}t,\tag{29}$$

bearing in mind that the matrix $R$ is required to be symmetric positive definite.

Corresponding to Problem 2.1, now the infinite-time optimal output regulation problem can be stated as follows.

Problem 4.1

Given the HOFA system (3) with the output Equation (5), a semi-positive definite matrix $Q\in\mathbb{R}^{m\times m}$ and a positive definite matrix $R\in\mathbb{R}^{r\times r}$, find for the system a feedback controller in the form of (8), with

$$v=v(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t),\tag{30}$$

such that the index $J_3$ given by (28) is minimised.

Remark 4.1

For the infinite-time optimal control problem, tracking an arbitrary signal $z(t)$, as in the finite-time optimal control problem formulated in Section 3, is generally not realisable. However, for certain signals $z(t)$ generated by a proper dynamical system, the infinite-time tracking problem can indeed be well formulated and solved (see, e.g. Anderson and Moore (1990)).

4.2. The solution

To derive the solution to the above Problem 4.1, the following well-known result is needed (see, e.g. Lemma 8.3.1 and Theorem 8.3.2 in Duan (2016)).

Lemma 4.2

Let $A\in\mathbb{R}^{n\times n}$, $B\in\mathbb{R}^{n\times r}$, and $[A,B]$ be controllable. Further let $R>0$, $Q=D^TD$, and $[A,D]$ be observable. Then

(1) the particular solution $P(t)$ to the Riccati differential equation

$$-\dot P=PA+A^TP-PBR^{-1}B^TP+Q,\tag{31}$$

satisfying the final value condition $P(t_f)=P_f$ converges, as $t_f\to\infty$, uniquely to a constant matrix $P$, that is, $\lim_{t_f\to\infty}P(t)=P$;

(2) this limit matrix $P$ is the unique positive definite solution to the following Riccati algebraic equation:

$$A^TP+PA-PBR^{-1}B^TP+Q=0;$$

(3) the following matrix

$$A_c=A-BR^{-1}B^TP$$

is Hurwitz.

Due to the above lemma, we introduce the following assumption:

Assumption A2

The matrix pair $[A_E,DC]$ is observable, where $D=Q^{\frac12}$.

Since $[\Phi_k(0_{0\sim\mu_k-1}),B_{kc}]$, $k=1,2,\ldots,\eta$, are all controllable, it is easy to see that $[A_E,B_E]$ is also controllable. Thus it follows from the above Lemma 4.2 and Assumption A2 that the particular solution $P(t)$ to the Riccati differential Equation (20) converges, as $t_f\to\infty$, to the unique positive definite solution $P$ of the following Riccati algebraic equation:

$$A_E^TP+PA_E-PB_ER^{-1}B_E^TP+C^TQC=0.\tag{32}$$

According to optimal control theory, for the linear system (14) the optimal controller which minimises the index

$$J_3=\frac12\int_0^{\infty}\big(\|y\|_Q^2+\|v\|_R^2\big)\,\mathrm{d}t\tag{33}$$

is given by

$$v=-R^{-1}B_E^TP\,x_k^{(0\sim\mu_k-1)}\Big|_{k=1\sim\eta},$$

and the corresponding closed-loop system is

$$\dot x_k^{(0\sim\mu_k-1)}\Big|_{k=1\sim\eta}=\big(A_E-B_ER^{-1}B_E^TP\big)\,x_k^{(0\sim\mu_k-1)}\Big|_{k=1\sim\eta},\tag{34}$$

which is asymptotically stable according to the above Lemma 4.2.

Based on the above analysis, it is easy to derive the following theorem about the solution to Problem 4.1.

Theorem 4.3

Let Assumptions A1 and A2 be met, $Q\in\mathbb{R}^{m\times m}$ be semi-positive definite, and $R\in\mathbb{R}^{r\times r}$ be positive definite. Then, for the system (3) with the output Equation (5), a feedback controller in the form of (8), which minimises the index $J_3$, is given by

$$\begin{cases}u=B^{-1}(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\big[-f(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)+v\big],\\[2pt] v=-R^{-1}B_E^TP\,x_k^{(0\sim\mu_k-1)}\big|_{k=1\sim\eta},\end{cases}\tag{35}$$

where $P\in\mathbb{R}^{\varkappa\times\varkappa}$ is the unique positive definite solution to the algebraic Riccati Equation (32). Furthermore, the closed-loop system (34) is linear and asymptotically stable, and the minimum index is given by

$$J^*=\frac12\Big\|x_k^{(0\sim\mu_k-1)}(0)\big|_{k=1\sim\eta}\Big\|_P^2.\tag{36}$$
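In the infinite-time case, the computation reduces to the single algebraic Riccati Equation (32), for which standard numerical solvers exist. The following minimal sketch is our own illustration (assuming SciPy's `solve_continuous_are`; the double-integrator chain with $\mu=2$, $r=1$ is chosen purely for demonstration) of computing (32)–(34):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Converted linear system (14) for a single chain with mu = 2, r = 1
# (a double integrator), with full-state output y = x^(0~1).
AE = np.array([[0.0, 1.0], [0.0, 0.0]])
BE = np.array([[0.0], [1.0]])
C = np.eye(2)
Q = np.eye(2)            # weight on y in J3
R = np.array([[1.0]])    # weight on v (= x^(mu)) in J3

P = solve_continuous_are(AE, BE, C.T @ Q @ C, R)   # solves the ARE (32)
K = np.linalg.inv(R) @ BE.T @ P                    # v = -K x, as in (35)
Acl = AE - BE @ K                                  # closed-loop matrix of (34)
assert np.all(np.linalg.eigvals(Acl).real < 0)     # Hurwitz, per Lemma 4.2
```

For this example the solution can be verified by hand: $P=\begin{bmatrix}\sqrt3&1\\1&\sqrt3\end{bmatrix}$, $K=[1,\ \sqrt3]$, and the closed-loop eigenvalues are $(-\sqrt3\pm i)/2$.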

Regarding the above result, we make the following remark.

Remark 4.2

Due to the above Theorem 4.3, the optimal controller (35) may be viewed as a stabilising controller for the system (3). However, different from a general stabilising controller for the system, the controller (35) also ensures that the state vector $x_k^{(0\sim\mu_k-1)}(t)|_{k=1\sim\eta}$ of the closed-loop system goes to zero with a smooth and steady transient performance; in particular, $\|x_k^{(0\sim\mu_k-1)}(t)|_{k=1\sim\eta}\|_P$ converges to zero monotonically.

Remark 4.3

Generally speaking, it is hard to achieve global stabilisation (including infinite-time optimal control) of nonlinear systems. However, as demonstrated by the above Theorem 4.3, once a nonlinear system is represented in a HOFA system form, global stabilisation can be easily realised. What is more, the closed-loop system is even linear (which obviously implies global exponential stabilisation). This fact again demonstrates the great advantage of the HOFA design approaches.

5. The sub-fully actuated case

Firstly, let us recall the concept of singular points of sub-fully actuated systems, introduced in Duan (2020j).

5.1. Set of singular points

For a sub-fully actuated system in the form of (1), the following concept is essential.

Definition 5.1

If $x_k^{(0\sim\mu_k-1)}(t)|_{k=1\sim\eta}\in\mathbb{R}^{\varkappa}$ satisfies

$$\det B(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta(t),t)=0\ \text{or}\ \infty,\tag{37}$$

then it is called a singular point of system (1) at time $t$.

Let $\mathcal S_t$ be the set of all singular points of system (1) at time $t$, that is,

$$\mathcal S_t=\Big\{x_k^{(0\sim\mu_k-1)}(t)\big|_{k=1\sim\eta}\ \Big|\ \text{Equation (37) holds}\Big\}.$$

Then we call, particularly,

$$\mathcal S_0=\mathcal S_t\big|_{t=0}$$

the set of singular initial value points of system (1). Furthermore, the following set

$$\mathcal S=\bigcup_{t\ge 0}\mathcal S_t$$

is called the set of singular points of system (1). Clearly, for a HOFA system (1) satisfying Assumption A1, there obviously holds

$$\mathcal S=\mathcal S_0=\mathcal S_t=\emptyset,\quad t\ge 0.$$

Let us define

$$\mathcal F=\mathbb{R}^{\varkappa}\setminus\mathcal S,$$

then $\mathcal F$ is called the set of feasible points of system (1). Similarly, the following sets

$$\mathcal F_t=\mathbb{R}^{\varkappa}\setminus\mathcal S_t\quad\text{and}\quad \mathcal F_0=\mathbb{R}^{\varkappa}\setminus\mathcal S_0$$

are called the set of feasible points of system (1) at time $t$, and the set of feasible initial value points of system (1), respectively. In general, the system (1) is called a sub-fully actuated system if $\mathcal F$ is a set of dimension not less than 1 (Duan, 2020j).

5.2. Optimal output regulation

With the above preparation, we can now solve the optimal control problem of sub-fully actuated systems. For simplicity, let us again illustrate the idea with the output regulation problem, but with Assumption A1 replaced with the following one.

Assumption A3

The set of singular points, S, meets the following two conditions:

  1. it does not depend on time $t$, that is, $\mathcal S=\mathcal S_t=\mathcal S_0$, $\forall t\ge 0$;

  2. it does not contain the origin, that is, $d_0=\inf\{\|z\|\ |\ z\in\mathcal S\}>0$.

The above $d_0$ is clearly the distance of the set $\mathcal S$ from the origin. As a consequence of the above Assumption A3, we also have

$$\mathcal F=\mathcal F_0=\mathcal F_t=\mathbb{R}^{\varkappa}\setminus\mathcal S,\quad t\ge 0.$$

With the above preparation, the optimal control problem to be solved can now be stated as follows.

Problem 5.1

Let the HOFA system (3), with the output Equation (5), satisfy Assumptions A2 and A3, and let $Q\in\mathbb{R}^{m\times m}$ be a semi-positive definite matrix and $R\in\mathbb{R}^{r\times r}$ a positive definite one. Find a feedback controller in the form of (8) and (30), such that

(1)

the index $J_3$ given by (28) is minimised, and

(2)

the following feasibility condition is met:

$$x_k^{(0\sim\mu_k-1)}(t)\Big|_{k=1\sim\eta}\in\mathcal F,\quad t\ge 0.\tag{38}$$

To solve the above problem, we need to present the following important lemma about stabilisation of sub-fully actuated systems.

Lemma 5.2

Let Assumption A3 be met. Further, let $K\in\mathbb{R}^{r\times\varkappa}$ be a matrix making

$$A_c=A_E+B_EK$$

Hurwitz, and $P_c\in\mathbb{R}^{\varkappa\times\varkappa}$ be a positive definite matrix satisfying the Lyapunov matrix equation

$$A_c^TP_c+P_cA_c+D_c^TD_c=0,$$

where $D_c\in\mathbb{R}^{q\times\varkappa}$, $q\le\varkappa$, and $[A_c,D_c]$ is observable. Then,

(1)

the following feedback controller

$$\begin{cases}u=B^{-1}(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)\big[-f(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},\zeta,t)+v\big],\\[2pt] v=K\,x_k^{(0\sim\mu_k-1)}\big|_{k=1\sim\eta},\end{cases}\tag{39}$$

stabilises the HOFA system (3), and

(2)

if, further, the initial values are chosen to satisfy

$$\sqrt{\nu(P_c)}\,\Big\|x_k^{(0\sim\mu_k-1)}(0)\big|_{k=1\sim\eta}\Big\|<d_0,\tag{40}$$

then the feasibility condition (38) is also met.

Proof.

Under the given conditions, the first conclusion can easily be proven, so we need only show that the feasibility requirement (38) is met when condition (40) holds.

First of all, since $\nu(P_c)\ge 1$, it follows from (40) that

$$\Big\|x_k^{(0\sim\mu_k-1)}(0)\big|_{k=1\sim\eta}\Big\|<\frac{1}{\sqrt{\nu(P_c)}}\,d_0\le d_0.\tag{41}$$

Thus we have

$$x_k^{(0\sim\mu_k-1)}(0)\Big|_{k=1\sim\eta}\in\mathcal F.\tag{42}$$

Next, let us define

$$V(t)=\Big\|x_k^{(0\sim\mu_k-1)}(t)\big|_{k=1\sim\eta}\Big\|_{P_c}^2,$$

then, following a typical treatment in the stability analysis of linear systems, we can prove that $\dot V(t)\le 0$. Therefore, $V(t)\le V(0)$, $t\ge 0$, that is,

$$\Big\|x_k^{(0\sim\mu_k-1)}(t)\big|_{k=1\sim\eta}\Big\|_{P_c}^2\le\Big\|x_k^{(0\sim\mu_k-1)}(0)\big|_{k=1\sim\eta}\Big\|_{P_c}^2,\quad t\ge 0.\tag{43}$$

From this we obtain, for $t\ge 0$,

$$\lambda_{\min}(P_c)\Big\|x_k^{(0\sim\mu_k-1)}(t)\big|_{k=1\sim\eta}\Big\|^2\le\lambda_{\max}(P_c)\Big\|x_k^{(0\sim\mu_k-1)}(0)\big|_{k=1\sim\eta}\Big\|^2,$$

which further gives, together with condition (40),

$$\Big\|x_k^{(0\sim\mu_k-1)}(t)\big|_{k=1\sim\eta}\Big\|^2\le\frac{\lambda_{\max}(P_c)}{\lambda_{\min}(P_c)}\Big\|x_k^{(0\sim\mu_k-1)}(0)\big|_{k=1\sim\eta}\Big\|^2=\nu(P_c)\Big\|x_k^{(0\sim\mu_k-1)}(0)\big|_{k=1\sim\eta}\Big\|^2<d_0^2.$$

This implies the relation (38). The proof is then completed.

Based on Theorem 4.3 and the above lemma, a solution to the above Problem 5.1 can be given as follows.

Theorem 5.3

Let Assumptions A2 and A3 be met, $Q\in\mathbb{R}^{m\times m}$ be semi-positive definite, and $R\in\mathbb{R}^{r\times r}$ be positive definite. Further, let the feedback controller in the form of (8) for the system (3) be given by (35), with $P\in\mathbb{R}^{\varkappa\times\varkappa}$ being the unique positive definite solution to the algebraic Riccati Equation (32). If, further, the initial values are chosen to satisfy

$$\sqrt{\nu(P)}\,\Big\|x_k^{(0\sim\mu_k-1)}(0)\big|_{k=1\sim\eta}\Big\|<d_0,\tag{44}$$

then the two requirements in Problem 5.1 are both met.

Proof.

Denote

$$K=-R^{-1}B_E^TP,\tag{45}$$

$$A_c=A_E+B_EK,\tag{46}$$

then, by Theorem 4.3, $A_c$ is Hurwitz. Further noting that

$$A_c^TP+PA_c=A_E^TP+PA_E-2PB_ER^{-1}B_E^TP,$$

it is clear that the Riccati Equation (32) is equivalent to the following Lyapunov matrix equation:

$$A_c^TP+PA_c+Q_c=0,\tag{47}$$

with

$$Q_c=C^TQC+PB_ER^{-1}B_E^TP.\tag{48}$$

Due to Assumption A2, we have $Q=D^TD$, and $[A_E,DC]$ is observable. Further letting

$$R^{-1}=T^TT,$$

we can write

$$Q_c=(DC)^TDC+PB_ET^TTB_E^TP=\begin{bmatrix}(DC)^T & PB_ET^T\end{bmatrix}\begin{bmatrix}DC\\ TB_E^TP\end{bmatrix}.\tag{49}$$

Remembering that $[A_E,DC]$ is observable, by the well-known PBH criterion we have

$$\operatorname{rank}\begin{bmatrix}sI-A_E\\ DC\end{bmatrix}=\varkappa,\quad\forall s\in\mathbb{C}.$$

Using this relation, we immediately have

$$\operatorname{rank}\begin{bmatrix}sI-A_E\\ DC\\ TB_E^TP\end{bmatrix}=\varkappa,\quad\forall s\in\mathbb{C},\tag{50}$$

that is, the matrix pair

$$\left[A_E,\ \begin{bmatrix}DC\\ TB_E^TP\end{bmatrix}\right]$$

is also observable, again by the PBH criterion. Therefore, the conclusion of the theorem can now easily be drawn by combining Theorem 4.3 and Lemma 5.2.
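The initial-value restriction (44) is straightforward to check numerically once the Riccati solution $P$ is available. The following fragment is an illustrative sketch of ours (the function name is hypothetical) evaluating $\sqrt{\nu(P)}\,\|x(0)\|<d_0$:

```python
import numpy as np

def feasible_initial_value(P, x0, d0):
    """Check the ball restriction (44): sqrt(nu(P)) * ||x0|| < d0,
    where nu(P) = ||P|| * ||P^{-1}|| is the condition number of P,
    which equals lambda_max(P) / lambda_min(P) for P > 0."""
    nu = np.linalg.cond(P)  # spectral (2-norm) condition number
    return bool(np.sqrt(nu) * np.linalg.norm(x0) < d0)
```

For instance, with $P=\operatorname{diag}(1,4)$ we have $\nu(P)=4$, so an initial value $x(0)$ is admitted by (44) exactly when $2\|x(0)\|<d_0$.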

5.3. Further discussions

This section further looks into two circumstances.

5.3.1. Case of S containing the origin

For simplicity, let us consider the following time-invariant HOFA system:

$$\begin{bmatrix}x_1^{(\mu_1)}\\ x_2^{(\mu_2)}\\ \vdots\\ x_\eta^{(\mu_\eta)}\end{bmatrix}=\begin{bmatrix}f_1(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta})\\ f_2(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta})\\ \vdots\\ f_\eta(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta})\end{bmatrix}+B(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta})\,u.\tag{51}$$

By definition, its set of singular points is

$$\mathcal S_x=\Big\{x_k^{(0\sim\mu_k-1)}\big|_{k=1\sim\eta}\ \Big|\ \det B(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta})=0\ \text{or}\ \infty\Big\}.$$

When the above set of singular points contains the origin, we can introduce a state transformation

$$z_k^{(0\sim\mu_k-1)}\Big|_{k=1\sim\eta}=x_k^{(0\sim\mu_k-1)}\Big|_{k=1\sim\eta}-q,\tag{52}$$

where $q\in\mathbb{R}^{\varkappa}$ is a constant vector. Under the above transformation, the system (51) is converted into

$$\begin{bmatrix}z_1^{(\mu_1)}\\ z_2^{(\mu_2)}\\ \vdots\\ z_\eta^{(\mu_\eta)}\end{bmatrix}=\begin{bmatrix}f_1(z_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},q)\\ f_2(z_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},q)\\ \vdots\\ f_\eta(z_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},q)\end{bmatrix}+B(z_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta},q)\,u.\tag{53}$$

Often, with a proper choice of the constant vector $q$, the set of singular points of the new system (53), namely $\mathcal S_z$, does not contain the origin; hence for the system (53) Assumption A3 holds. The optimal control of system (51) can then be solved by applying the above Theorem 5.3 to the transformed system (53).

5.3.2. Decoupled designs

It is easily recognised from the proof of Theorem 5.3 that the subset of feasible initial values given by Theorem 5.3, that is, the set of points satisfying condition (44), may be conservative. In certain cases, e.g. when $\det B(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta})$ depends on only some, but not all, of the $x_k$'s and their derivatives, less conservative initial value ranges can be provided.

For convenience, let us simply assume that $\det B(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta})$ depends only on $x_\eta^{(0\sim\mu_\eta-1)}$, that is, there exists a scalar function $h(x_\eta^{(0\sim\mu_\eta-1)})$ such that

$$\det B(x_k^{(0\sim\mu_k-1)}|_{k=1\sim\eta})=h(x_\eta^{(0\sim\mu_\eta-1)}).$$

In this case, instead of converting (51) into the whole system (14), we can convert (51) into two subsystems in state-space forms, as

$$\dot x_\eta^{(0\sim\mu_\eta-1)}=\Phi_\eta(0_{0\sim\mu_\eta-1})\,x_\eta^{(0\sim\mu_\eta-1)}+B_{\eta c}v_\eta,\tag{54}$$

and

$$\dot x_k^{(0\sim\mu_k-1)}\Big|_{k=1\sim\eta-1}=A_E\,x_k^{(0\sim\mu_k-1)}\Big|_{k=1\sim\eta-1}+B_Ev_{1\sim\eta-1},\tag{55}$$

where

$$v_{1\sim\eta-1}=\begin{bmatrix}v_1\\ v_2\\ \vdots\\ v_{\eta-1}\end{bmatrix},\qquad A_E=\operatorname{blockdiag}(\Phi_k(0_{0\sim\mu_k-1}),\,k=1,2,\ldots,\eta-1),\qquad B_E=\operatorname{blockdiag}(B_{kc},\,k=1,2,\ldots,\eta-1).$$

Now, in Step III of Section 3, we can apply the linear quadratic regulation technique to the systems (54) and (55) separately, as treated in the proof of Theorem 3.1. As a consequence, the initial values of the system (55) can be chosen freely, while only the initial values of the system (54) need to be properly restricted to meet the feasibility of the controller (8), as treated in the proof of the above Theorem 5.3. In such a way, a much tighter subset of feasible initial values can be provided. The next section gives a demonstration of this idea with an application to spacecraft attitude control.

6. Spacecraft attitude control

6.1. The system model

Consider the modelling of the attitude system of a spacecraft. The coordinate system on the spacecraft is shown in Figure 1. The origin of the coordinate system is taken at the centre of the spacecraft, the z-axis points from the centre of the spacecraft to the earth centre, the x-axis goes along the flight direction of the spacecraft, and the y-axis completes the right-hand coordinate system.

Figure 1. Coordinate system transformation.


Denote by ψ the yaw angle, φ the roll angle and ϑ the pitch angle, and put
\[
q = \begin{bmatrix} \varphi \\ \vartheta \\ \psi \end{bmatrix}. \tag{56}
\]
Then, when the order of rotation of the coordinate system is Z(ψ)X(φ)Y(ϑ), the dynamical model for the spacecraft attitude system can be written in the following second-order matrix form (Duan, 2014):
\[
M(q)\ddot{q} + D(q,\dot{q})\dot{q} + \xi(q,\dot{q}) = u, \tag{57}
\]
where \(u\) is the torque vector, and
\[
M(q) = \begin{bmatrix}
I_x\cos\vartheta & 0 & -I_x\cos\varphi\sin\vartheta \\
0 & I_y & I_y\sin\varphi \\
I_z\sin\vartheta & 0 & I_z\cos\varphi\cos\vartheta
\end{bmatrix}; \tag{58}
\]
the functions \(D(q,\dot{q})\) and \(\xi(q,\dot{q})\) are given in the appendix. Thus the system can be rewritten in the following standard HOFA system form:
\[
\ddot{q} = f(q,\dot{q}) + B(q)u, \tag{59}
\]
where
\[
f(q,\dot{q}) = -M^{-1}(q)\big[D(q,\dot{q})\dot{q} + \xi(q,\dot{q})\big], \tag{60}
\]
\[
B(q) = M^{-1}(q). \tag{61}
\]
Noting that
\[
\det B(q) = \frac{1}{I_xI_yI_z\cos\varphi},
\]
the above system (59)–(61) has the following set of singular points:
\[
S = \left\{ \begin{bmatrix} \varphi \\ \vartheta \\ \psi \end{bmatrix} \ \middle|\ \varphi = \pm\frac{2k+1}{2}\pi,\ k=0,1,2,\ldots \right\}. \tag{62}
\]
Clearly, the distance of \(S\) from the origin is \(d_0 = \pi/2\).
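As a numerical sanity check of the singularity analysis (a sketch only; the sign pattern used for \(M(q)\) below is one reconstruction consistent with \(\det M(q)=I_xI_yI_z\cos\varphi\) and may differ from the paper in inessential signs), the determinant of \(M(q)\) can be evaluated at a few attitudes and compared with \(I_xI_yI_z\cos\varphi\):

```python
import math

Ix, Iy, Iz = 18.0, 21.0, 24.0  # moments of inertia used in Section 6.3

def M(phi, theta):
    # One sign placement consistent with det M = Ix*Iy*Iz*cos(phi);
    # the determinant, and hence the singular set, is unaffected by it.
    return [
        [Ix * math.cos(theta), 0.0, -Ix * math.cos(phi) * math.sin(theta)],
        [0.0, Iy, Iy * math.sin(phi)],
        [Iz * math.sin(theta), 0.0, Iz * math.cos(phi) * math.cos(theta)],
    ]

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along row 0."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
          - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
          + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

for phi, theta in [(0.1, 0.7), (-1.2, 2.5), (0.5236, -0.3)]:
    assert abs(det3(M(phi, theta)) - Ix * Iy * Iz * math.cos(phi)) < 1e-9

# B(q) = M^{-1}(q) loses rank exactly where cos(phi) = 0, i.e. phi = +-pi/2, ...
print(abs(det3(M(math.pi / 2, 0.4))))  # ~0 -> singular point
```

The check confirms that the singular set depends on the roll angle φ alone, which is exactly what makes the decoupled design of Subsection 5.3.2 applicable here.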

6.2. Optimal control design

The design objective is to stabilise the HOFA system (59)–(61) with a nonlinear controller of the form (8), with the particular intention of keeping the spacecraft moving smoothly and steadily. Theoretically, this requires \(q\), \(\dot{q}\) and also \(\ddot{q}\) to all be as small as possible, and a proper choice of the index in the form of \(J_3\) in (28) adequately meets this need. Therefore, the optimal control problem for the system (59)–(61) can be well formulated and solved by applying Theorem 5.3.

However, in order to obtain a larger feasible set of initial values to cope with the singularity in the system, we here adopt the decoupled design idea illustrated in Subsection 5.3.2, and minimise separately the following two indices:
\[
J_a = \frac{1}{2}\int_0^{\infty}\left( \big(\varphi^{(0\sim 1)}\big)^{\mathrm T}\begin{bmatrix} q_1 & q_0 \\ q_0 & q_2 \end{bmatrix}\varphi^{(0\sim 1)} + \ddot{\varphi}^2 \right)\mathrm{d}t, \tag{63}
\]
\[
J_b = \frac{1}{2}\int_0^{\infty}\left( \begin{bmatrix} \vartheta^{(0\sim 1)} \\ \psi^{(0\sim 1)} \end{bmatrix}^{\mathrm T} Q_b \begin{bmatrix} \vartheta^{(0\sim 1)} \\ \psi^{(0\sim 1)} \end{bmatrix} + \begin{bmatrix} \ddot{\vartheta} \\ \ddot{\psi} \end{bmatrix}^{\mathrm T} R_b \begin{bmatrix} \ddot{\vartheta} \\ \ddot{\psi} \end{bmatrix} \right)\mathrm{d}t, \tag{64}
\]
where \(R_b\) is positive definite, while \(Q_a = \begin{bmatrix} q_1 & q_0 \\ q_0 & q_2 \end{bmatrix}\) and \(Q_b\) are positive semi-definite.

With the following controller,
\[
u = M(q)\left( -f(q,\dot{q}) + \begin{bmatrix} v_a \\ v_b \end{bmatrix} \right),
\]
the system is turned into two separate linear subsystems:
\[
\ddot{\varphi} = v_a \tag{65}
\]
and
\[
\begin{bmatrix} \ddot{\vartheta} \\ \ddot{\psi} \end{bmatrix} = v_b. \tag{66}
\]
Their state-space forms are, respectively,
\[
\dot{\varphi}^{(0\sim 1)} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\varphi^{(0\sim 1)} + \begin{bmatrix} 0 \\ 1 \end{bmatrix}v_a \tag{67}
\]
and
\[
\begin{bmatrix} \dot{\vartheta}^{(0\sim 1)} \\ \dot{\psi}^{(0\sim 1)} \end{bmatrix} = A_E\begin{bmatrix} \vartheta^{(0\sim 1)} \\ \psi^{(0\sim 1)} \end{bmatrix} + B_E v_b, \tag{68}
\]
where
\[
A_E = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix},\quad
B_E = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}. \tag{69}
\]
Therefore, the controller finally designed for the system is
\[
\begin{cases}
u = M(q)\left( -f(q,\dot{q}) + \begin{bmatrix} v_a \\ v_b \end{bmatrix} \right) \\[4pt]
v_a = -K_a\varphi^{(0\sim 1)} \\[2pt]
v_b = -K_b\begin{bmatrix} \vartheta^{(0\sim 1)} \\ \psi^{(0\sim 1)} \end{bmatrix},
\end{cases} \tag{70}
\]
where
\[
K_b = R_b^{-1}B_E^{\mathrm T}P_b, \tag{71}
\]
with \(P_b\) satisfying the Riccati equation
\[
A_E^{\mathrm T}P_b + P_bA_E - P_bB_ER_b^{-1}B_E^{\mathrm T}P_b + Q_b = 0, \tag{72}
\]
and
\[
K_a = \begin{bmatrix} 0 & 1 \end{bmatrix}P_a = \begin{bmatrix} p_0 & p_2 \end{bmatrix}, \tag{73}
\]
with \(P_a\) given by
\[
P_a = \begin{bmatrix} p_1 & p_0 \\ p_0 & p_2 \end{bmatrix} \tag{74}
\]
and
\[
\begin{cases}
p_0 = \sqrt{q_1} \\
p_2 = \sqrt{q_2 + 2p_0} \\
p_1 = p_0p_2 - q_0.
\end{cases} \tag{75}
\]
The closed-loop system is composed of
\[
\dot{\varphi}^{(0\sim 1)} = \begin{bmatrix} 0 & 1 \\ -p_0 & -p_2 \end{bmatrix}\varphi^{(0\sim 1)} \tag{76}
\]
and
\[
\begin{bmatrix} \dot{\vartheta}^{(0\sim 1)} \\ \dot{\psi}^{(0\sim 1)} \end{bmatrix} = (A_E - B_EK_b)\begin{bmatrix} \vartheta^{(0\sim 1)} \\ \psi^{(0\sim 1)} \end{bmatrix}. \tag{77}
\]
To ensure the feasibility of the controller for the system (59)–(61), by Theorem 5.3 it is sufficient to impose the following initial value restriction:
\[
\nu(P_a)\big(\varphi^2(0) + \dot{\varphi}^2(0)\big) < \frac{\pi^2}{4}. \tag{78}
\]
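The analytic solution (73)–(75) can be verified numerically: for the double integrator (67) with state weight \(Q_a\) and unit control weight, substituting \(P_a\) of (74) into the algebraic Riccati equation yields exactly the formulas (75). A minimal sketch (the function names are mine):

```python
import math

def lqr_double_integrator(q1, q0, q2):
    """Analytic ARE solution (75) for x'' = v with
    J = 1/2 * int( [x x']^T [[q1,q0],[q0,q2]] [x x'] + v^2 ) dt."""
    p0 = math.sqrt(q1)
    p2 = math.sqrt(q2 + 2.0 * p0)
    p1 = p0 * p2 - q0
    P = [[p1, p0], [p0, p2]]
    K = [p0, p2]              # K = [0 1] P, as in (73)
    return P, K

def are_residual(P, q1, q0, q2):
    """Entries of A^T P + P A - P B B^T P + Q for A=[[0,1],[0,0]], B=[0,1]^T."""
    p1, p0 = P[0]
    _, p2 = P[1]
    return [[q1 - p0 * p0,       p1 - p0 * p2 + q0],
            [p1 - p0 * p2 + q0,  2.0 * p0 - p2 * p2 + q2]]

P, K = lqr_double_integrator(1.0, 0.0, 2.5)  # Q_a used in Section 6.3
print([round(k, 4) for k in K])              # [1.0, 2.1213], matching (81)
```

For the block-diagonal problem (68)–(69) with \(Q_b=\gamma I_4\) and \(R_b=I_2\), the same formulas apply per 2×2 block: \(\gamma=1\) gives the block gain \([1,\ 1.7321]\) of (86), and \(\gamma=0.1\) gives \([0.3162,\ 0.8558]\) of (94).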

6.3. Simulation results

Consider a spacecraft with the following moments of inertia (Gao et al., 2013):
\[
I_x = 18,\quad I_y = 21,\quad I_z = 24. \tag{79}
\]
The weighting matrices \(Q_a\), \(Q_b\) and \(R_b\) are chosen as
\[
Q_a = \begin{bmatrix} 1 & 0 \\ 0 & 2.5 \end{bmatrix},\quad Q_b = \gamma I_4,\quad R_b = I_2, \tag{80}
\]
where \(\gamma\) is a positive parameter.

According to (73) and (74), the matrices \(K_a\) and \(P_a\) are obtained as
\[
K_a = \begin{bmatrix} 1 & 2.1213 \end{bmatrix} \tag{81}
\]
and
\[
P_a = \begin{bmatrix} 2.1213 & 1 \\ 1 & 2.1213 \end{bmatrix}, \tag{82}
\]
respectively. In this case, we have
\[
\nu(P_a) = 2.7836, \tag{83}
\]
and hence, according to (78), the initial values of \(\varphi^{(0\sim 1)}\) need to satisfy
\[
\varphi^2(0) + \dot{\varphi}^2(0) < 0.9415. \tag{84}
\]
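The quoted value of \(\nu(P_a)\) can be reproduced under the assumption (its definition accompanies Theorem 5.3 and is not restated in this section) that \(\nu(\cdot)\) denotes the eigenvalue ratio \(\lambda_{\max}/\lambda_{\min}\); for the symmetric matrix \(P_a\) in (82) the eigenvalues are \(2.1213 \pm 1\):

```python
# Eigenvalues of the 2x2 symmetric matrix P_a = [[a, b], [b, a]] are a +- b.
a, b = 2.1213, 1.0
lam_max, lam_min = a + b, a - b
nu = lam_max / lam_min
print(round(nu, 4))  # 2.7836, matching (83)
```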

6.3.1. Attitude stabilisation

Choosing \(\gamma = 1\), we obtain
\[
P_b = \begin{bmatrix}
1.7321 & 1 & 0 & 0 \\
1 & 1.7321 & 0 & 0 \\
0 & 0 & 1.7321 & 1 \\
0 & 0 & 1 & 1.7321
\end{bmatrix}, \tag{85}
\]
and hence the feedback gain
\[
K_b = \begin{bmatrix} 1 & 1.7321 & 0 & 0 \\ 0 & 0 & 1 & 1.7321 \end{bmatrix}. \tag{86}
\]
Case A1: When the initial values are taken as
\[
\varphi(0) = 0.5236,\quad \dot{\varphi}(0) = 0.2104, \tag{87}
\]
\[
\vartheta(0) = \tfrac{2}{3}\pi,\quad \dot{\vartheta}(0) = 1.2, \tag{88}
\]
and
\[
\psi(0) = \tfrac{\pi}{8},\quad \dot{\psi}(0) = \tfrac{\pi}{12}, \tag{89}
\]
the simulation of the designed control system has been carried out, and the results are shown in Figure 2.

Figure 2. Simulation results for attitude stabilisation, Case A1.


Case A2: When only the initial values in (89) are replaced by
\[
\psi(0) = 0.9,\quad \dot{\psi}(0) = \tfrac{\pi}{4}, \tag{90}
\]
while the others are kept unchanged as in Case A1, the simulation results are shown in Figure 3.

Figure 3. Simulation results for attitude stabilisation, Case A2.


The simulation results clearly indicate the following:

  1. as desired, in both cases the control technique provides very smooth and steady system responses and

  2. a slight change in the initial values \(\psi^{(0\sim 1)}(0)\) causes a large change in the corresponding control input \(u_1\).

The second phenomenon above reflects a difference between nonlinear and linear systems. The reason lies in the fact that, although the two closed-loop linear subsystems are decoupled, the control inputs are still closely coupled through the function \(f(\cdot)\).

6.3.2. Rapid turning elimination

In this subsection, we test the rapid turning elimination control of the spacecraft. Let us assume that the spacecraft is turning in the \(\vartheta\) and \(\psi\) directions at very high rates. Specifically, this is reflected in the following choice of the initial values:
\[
\varphi(0) = 0.5236,\quad \dot{\varphi}(0) = 0.2104, \tag{91}
\]
\[
\vartheta(0) = 5\pi,\quad \dot{\vartheta}(0) = 5, \tag{92}
\]
and
\[
\psi(0) = 2\pi,\quad \dot{\psi}(0) = 4. \tag{93}
\]
With such large initial values, it is impractical to stop the turning of the spacecraft within a very short time. Therefore, we decrease the weighting matrix \(Q_b\) in (80) by reducing the factor \(\gamma\).

Case B1. In the case of \(\gamma = 0.1\), we obtain
\[
K_b = \begin{bmatrix} 0.3162 & 0.8558 & 0 & 0 \\ 0 & 0 & 0.3162 & 0.8558 \end{bmatrix}, \tag{94}
\]
and the simulation results are shown in Figure 4.

Figure 4. Simulation results for turning elimination, Case B1.


Case B2. In the case of \(\gamma = 0.01\), we obtain
\[
K_b = \begin{bmatrix} 0.1 & 0.4583 & 0 & 0 \\ 0 & 0 & 0.1 & 0.4583 \end{bmatrix}, \tag{95}
\]
and the corresponding simulation results are shown in Figure 5.

Figure 5. Simulation results for turning elimination, Case B2.


The following can be clearly seen from Figures 4 and 5:

  1. Very effective rapid turning elimination is achieved in both Cases B1 and B2, with smooth transient processes. In particular, the turning in Case B1 is stopped in approximately 10 seconds, and that in Case B2 in approximately 25 seconds.

  2. In both cases, the required control amplitudes are still within a reasonable range. The control amplitude can be reduced further if the time required to stop the turning is relaxed.
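The reported stopping times can be cross-checked on the decoupled closed-loop pitch channel of Case B1, \(\ddot{\vartheta} = -0.3162\,\vartheta - 0.8558\,\dot{\vartheta}\). A minimal sketch (forward-Euler integration; the step size and the "stopped" thresholds are my choices):

```python
import math

def simulate_pitch(k0, k1, theta0, dtheta0, t_end, dt=1e-3):
    """Forward-Euler simulation of the closed-loop double integrator
    theta'' = -k0*theta - k1*theta'."""
    th, dth = theta0, dtheta0
    for _ in range(int(t_end / dt)):
        ddth = -k0 * th - k1 * dth
        th, dth = th + dt * dth, dth + dt * ddth
    return th, dth

# Case B1 gains for the theta channel, initial values from (92)
th, dth = simulate_pitch(0.3162, 0.8558, 5 * math.pi, 5.0, t_end=15.0)
print(abs(dth) < 0.5)  # True: the turning rate is essentially eliminated
```

The closed-loop eigenvalues here are \(-0.4279 \pm 0.3648\mathrm{j}\), so the dominant time constant is about 2.3 s, consistent with the roughly 10-second stopping time observed in Case B1.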

7. Conclusion

Apart from the well-known linear quadratic optimal control technique, general nonlinear optimal control in the state-space framework remains a largely open problem in the field of systems and control, and the available theoretical results are often difficult to apply in practice due to the nonlinear equations involved.

Parallel to the state-space approaches, HOFA approaches have recently been proposed and demonstrated to be much more effective in dealing with system control problems (Duan, 2020a, 2020b, 2020c, 2020d, 2020e, 2020f, 2020g, 2020h, 2020i, 2020j). It is further shown in this paper that, with the proposed HOFA approach, a type of general nonlinear optimal control problem can be well formulated and solved.

The problem is formulated to minimise an index in a quadratic form of the state and its derivatives of certain orders. It is shown that, with the help of the full-actuation feature of the HOFA models, the problem is linked to an optimal control problem of a linear system. Therefore, a nonlinear state feedback optimal controller can then be easily solved based on the solution to a Riccati differential or algebraic equation.

It is also shown that the formulation can be conveniently generalised to the case of sub-fully actuated systems. It turns out that the feasibility of the controller can be met by only restricting the system initial values to be within a proper ball area.

Spacecraft attitude systems are highly nonlinear ones when the attitude angles are not restricted to vary within very small ranges. Therefore, spacecraft attitude control with smooth and steady transient processes and accurate attitude manoeuvring has always remained a very hard problem. The proposed approach has been demonstrated to provide a very effective and simple way to handle such nonlinear spacecraft attitude control problems.

The results in this paper can be extended to other types of systems, e.g. discrete-time systems and uncertain systems. In addition, how to reduce the control magnitude in nonlinear optimal control certainly remains a very difficult problem.

Acknowledgments

The author is grateful to his Ph.D. students Guangtai Tian, Qin Zhao, Xiubo Wang, Weizhen Liu, Kaixin Cui, and Liyao Hu, and also his visiting scholar, Prof. Yang Cui, for helping him with reference selection and proofreading. His particular thanks go to his students Bin Zhou and Tianyi Zhao for their useful suggestions and help with the simulations, respectively.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work has been partially supported by the Major Program of National Natural Science Foundation of China [grant numbers 61690210, 61690212], the National Natural Science Foundation of China [grant number 61333003] and the Self-Planned Task of State Key Laboratory of Robotics and System (HIT) [grant number SKLRS201716A].

Notes on contributors

Guangren Duan

Guangren Duan received his Ph.D. degree in Control Systems Sciences from Harbin Institute of Technology, Harbin, P. R. China, in 1989. After a 2-year post-doctoral experience at the same university, he became a professor of control systems theory there in 1991. He is the founder and currently the director of the Center for Control Theory and Guidance Technology at Harbin Institute of Technology. He visited the University of Hull, the University of Sheffield, and the Queen's University of Belfast, UK, from December 1996 to October 2002, and has served as a member of the Science and Technology Committee of the Chinese Ministry of Education, Vice President of the Control Theory and Applications Committee of the Chinese Association of Automation (CAA), and as an Associate Editor of several international journals. He is currently an academician of the Chinese Academy of Sciences, and a fellow of CAA, IEEE and IET. His main research interests include parametric control systems design, nonlinear systems, descriptor systems, spacecraft control and magnetic bearing control. He is the author and co-author of 5 books and over 270 SCI-indexed publications.


Appendix. Other terms in model (57)

The terms \(\xi(q,\dot{q})\) and \(D(q,\dot{q})\) in the spacecraft attitude system (57) are given by
\[
\xi(q,\dot{q}) = \begin{bmatrix} \omega_0^2(I_z - I_y)\pi_x \\ \omega_0^2(I_x - I_z)\pi_y \\ \omega_0^2(I_y - I_x)\pi_z \end{bmatrix} \tag{A1}
\]
and
\[
D(q,\dot{q}) = \begin{bmatrix} D_1(q,\dot{q}) & D_2(q,\dot{q}) & D_3(q,\dot{q}) \end{bmatrix}, \tag{A2}
\]
with
\[
D_1(q,\dot{q}) = \begin{bmatrix} (I_x + I_z - I_y)\pi_{x\varphi} \\ I_y\pi_{y\varphi a} + (I_x - I_z)\pi_{y\varphi b} \\ (I_x + I_z - I_y)\pi_{z\varphi} \end{bmatrix}, \tag{A3}
\]
\[
D_2(q,\dot{q}) = \begin{bmatrix} (I_x + I_y - I_z)\pi_{x\vartheta} \\ 0 \\ (I_z + I_y - I_x)\pi_{z\vartheta} \end{bmatrix}, \tag{A4}
\]
\[
D_3(q,\dot{q}) = \begin{bmatrix} I_x\omega_0\pi_{x\psi a} + (I_z - I_y)\pi_{x\psi b} \\ I_y\omega_0\pi_{y\psi a} + (I_x - I_z)\pi_{y\psi b} \\ I_x\pi_{z\psi a} + (I_y - I_x)\pi_{z\psi b} \end{bmatrix}, \tag{A5}
\]
where, in the above series of equations,
\[
\omega_0 = 7.292115\times 10^{-5}\ \text{rad/s} \tag{A6}
\]
is the rotational angular velocity of the earth, and the set of \(\pi\)-functions appearing above is given by
\[
\begin{cases}
\pi_{x\varphi} = \dot{\vartheta}\sin\vartheta + \dot{\psi}\sin\varphi\sin\vartheta - \omega_0\cos\varphi\sin\vartheta\cos\psi \\
\pi_{x\vartheta} = \omega_0\sin\vartheta\sin\psi - \dot{\psi}\cos\varphi\cos\vartheta - \omega_0\sin\varphi\cos\vartheta\cos\psi \\
\pi_{x\psi a} = \sin\varphi\sin\vartheta\sin\psi - \cos\vartheta\cos\psi \\
\pi_{x\psi b} = \dot{\psi}\sin\varphi\cos\varphi\cos\vartheta - \omega_0\sin\varphi\sin\vartheta\sin\psi + \omega_0\sin^2\varphi\cos\vartheta\cos\psi - \omega_0\cos^2\varphi\cos\vartheta\cos\psi \\
\pi_{x0} = \cos\varphi\sin\vartheta\sin\psi\cos\psi - \sin\varphi\cos\varphi\cos\vartheta\cos^2\psi \\
\pi_x = \pi_{x0} - \frac{3}{2}\sin 2\varphi\cos\vartheta,
\end{cases} \tag{A7}
\]
\[
\begin{cases}
\pi_{y\varphi a} = \omega_0\sin\varphi\cos\psi + \dot{\psi}\cos\varphi \\
\pi_{y\varphi b} = \dot{\psi}\cos\varphi\cos^2\vartheta - \dot{\psi}\cos\varphi\sin^2\vartheta + \dot{\varphi}\sin\vartheta\cos\vartheta + \omega_0\sin\varphi\cos^2\vartheta\cos\psi - \omega_0\sin\varphi\sin^2\vartheta\cos\psi - 2\omega_0\sin\vartheta\sin\psi\cos\vartheta \\
\pi_{y\psi a} = \cos\varphi\sin\psi \\
\pi_{y\psi b} = \omega_0\cos\varphi\sin^2\vartheta\sin\psi - \omega_0\cos\varphi\cos^2\vartheta\sin\psi - \dot{\psi}\cos^2\varphi\cos\vartheta\sin\vartheta - 2\omega_0\sin\varphi\sin\vartheta\cos\varphi\cos\vartheta\cos\psi \\
\pi_{y0} = \cos\vartheta\sin\vartheta\sin^2\psi + \sin\varphi\sin^2\vartheta\sin\psi\cos\psi - \sin\varphi\cos^2\vartheta\cos\psi\sin\psi - \sin^2\varphi\sin\vartheta\cos\vartheta\cos^2\psi \\
\pi_y = \pi_{y0} + \frac{3}{2}\cos^2\varphi\sin 2\vartheta,
\end{cases} \tag{A8}
\]
and
\[
\begin{cases}
\pi_{z\varphi} = \omega_0\cos\varphi\cos\vartheta\cos\psi - \dot{\psi}\sin\varphi\cos\vartheta \\
\pi_{z\vartheta} = \dot{\varphi}\cos\vartheta - \dot{\psi}\cos\varphi\sin\vartheta - \omega_0\cos\vartheta\sin\psi - \omega_0\sin\varphi\sin\vartheta\cos\psi \\
\pi_{z\psi a} = \omega_0\sin\vartheta\cos\psi - \omega_0\sin\varphi\cos\vartheta\sin\psi \\
\pi_{z\psi b} = \omega_0\cos^2\varphi\sin\vartheta\cos\psi - \dot{\psi}\sin\varphi\cos\varphi\sin\vartheta - \omega_0\sin\varphi\cos\vartheta\sin\psi - \omega_0\sin^2\varphi\sin\vartheta\cos\psi \\
\pi_{z0} = \cos\varphi\cos\vartheta\cos\psi\sin\psi + \cos\varphi\sin\varphi\sin\vartheta\cos^2\psi \\
\pi_z = \pi_{z0} + \frac{3}{2}\sin 2\varphi\sin\vartheta.
\end{cases} \tag{A9}
\]