Research Article

Robustness improvement of optimal control in terms of RBFNN with empirical model reduction and transfer learning

Received 06 Mar 2023, Accepted 05 Mar 2024, Published online: 21 Mar 2024

Figures & data

Figure 1. Flowchart of the RBFNN optimal control algorithm with model reduction and transfer learning for linear systems.

Figure 2. Quanser-Servo2 Inverted Pendulum system hardware setup (Quanser, 2022).

Table 1. Parameters of the rotary pendulum system.

Table 2. Summary of reduced order model matrices from model-based gramians and empirical gramians.

Table 3. Hankel singular values of the rotary pendulum.

Figure 3. Top: Relative output errors of the reduced order models obtained by balanced truncation and empirical balanced truncation. Bottom: The input signal.

Figure 4. Square-wave tracking of the rotary arm in simulations. In the legend, ‘LQR’ denotes the LQR control designed with the original model; ‘RBFNN’ denotes the RBFNN control designed with the model-based BT; ‘Empirical RBFNN’ denotes the RBFNN control designed with the empirical BT.

Figure 5. Comparison of the tracking response θ(t) of the rotary arm in simulations. In the legend, ‘RBFNN_Empirical’ denotes the RBFNN control designed with the empirical BT; ‘LQR_Empirical BT’ denotes the LQR control designed with the empirical BT; ‘LQR_Model-based BT’ denotes the LQR control designed with the model-based BT.

Figure 6. Comparison of the tracking response α(t) of the rotary arm in simulations. Legends are the same as in Figure 5.

Figure 7. The closed-loop tracking responses θ(t) of the rotary arm of Quanser-Servo2. Legends are the same as in Figure 4.

Figure 8. The closed-loop responses θ(t) of the rotary arm under various controls for balancing the inverted pendulum of Quanser-Servo2. Top: Responses before retraining. Bottom: Responses after retraining. Legends are the same as in Figure 4.

Figure 9. The closed-loop tracking response θ(t) of the rotary arm of Quanser-Servo2. Legends are the same as in Figure 4.

Figure 10. The closed-loop response α(t) of the pendulum in the rotary arm tracking control of Quanser-Servo2. Legends are the same as in Figure 4.

Table 4. Summary of control performance for LQR, RBFNN, empirical RBFNN, retrained RBFNN and retrained empirical RBFNN.

Figure 11. Robustness comparisons of all the controls under consideration. Top: The closed-loop angle response θ(t) of the rotary arm in balancing control of Quanser-Servo2. Bottom: Disturbance d(t). Legends are the same as in Figure 4.

Figure 12. Disturbances to the second order nonlinear system.

Figure 13. Performance comparison of the RBFNN, Poly-NN and LQR controls for the nonlinear system in Equation (61), $\dot{x}_1 = x_1 + x_2 - x_1(x_1^2 + x_2^2)$, $\dot{x}_2 = -x_1 + x_2 - x_2(x_1^2 + x_2^2) + u$. Top: The control u(t). Middle: The response x1(t). Bottom: The response x2(t).

Figure 14. Comparison of the spatial distribution of the RBFNN and Poly-NN optimal controls u as a function of the state x. Left: The control u(x) plotted in the training region Xs1∈[−1,1]×[−1,1]. Right: The control u(x) plotted beyond the training region into the larger region Xs2∈[−2,2]×[−2,2].

Figure 15. Robustness of the RBFNN and LQR controls with respect to the model uncertainty β. The vertical dashed lines mark the critical value of β, beyond which the closed-loop system becomes unstable.

Figure A1. Comparison of RBFNNs and LQR control performances for the linear 2D system. Top: Control u(t). Middle: Response x1(t). Bottom: Response x2(t). The initial condition of the system is x(0) = [1, 1]^T.