
Online adaptive optimal control for continuous-time nonlinear systems with completely unknown dynamics

Pages 99-112 | Received 13 Nov 2014, Accepted 04 Jun 2015, Published online: 31 Jul 2015
