References
- Abu-Khalaf, M., & Lewis, F.L. (2005). Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica, 41, 779–791.
- Apostol, T.M. (1974). Mathematical analysis (2nd ed.). Massachusetts: Addison-Wesley.
- Beard, R., Saridis, G., & Wen, J. (1997). Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation. Automatica, 33, 2159–2177.
- Bellman, R.E. (1957). Dynamic programming. New Jersey: Princeton University Press.
- Bertsekas, D.P., & Tsitsiklis, J.N. (1996). Neuro-dynamic programming. Massachusetts: Athena Scientific.
- Bhasin, S., Kamalapurkar, R., Johnson, M., Vamvoudakis, K.G., Lewis, F.L., & Dixon, W.E. (2013). A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems. Automatica, 49, 82–92.
- Dierks, T., Thumati, B.T., & Jagannathan, S. (2009). Optimal control of unknown affine nonlinear discrete-time systems using offline-trained neural networks with proof of convergence. Neural Networks, 22, 851–860.
- Hayakawa, T., Haddad, W.M., & Hovakimyan, N. (2008). Neural network adaptive control for nonlinear uncertain dynamical systems with asymptotic stability guarantees. IEEE Transactions on Neural Networks, 19, 80–89.
- Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2, 359–366.
- Ioannou, P.A., & Sun, J. (1996). Robust adaptive control. New Jersey: Prentice-Hall.
- Kek, S.L., Teo, K.L., & Ismail, A.A. (2010). An integrated optimal control algorithm for discrete-time nonlinear stochastic system. International Journal of Control, 83, 2536–2545.
- Koshkouei, A.J., Farahi, M.H., & Burnham, K.J. (2012). An almost optimal control design method for nonlinear time-delay systems. International Journal of Control, 85, 147–158.
- Lewis, F.L., Jagannathan, S., & Yesildirek, A. (1999). Neural network control of robot manipulators and nonlinear systems. London: Taylor & Francis.
- Lewis, F.L., & Liu, D. (2013). Reinforcement learning and approximate dynamic programming for feedback control. New York: John Wiley & Sons, Inc.
- Lewis, F.L., & Syrmos, V.L. (1995). Optimal control. New York: John Wiley & Sons, Inc.
- Lewis, F.L., & Vrabie, D. (2009). Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits and Systems Magazine, 9, 32–50.
- Lin, W. (2011). Optimality and convergence of adaptive optimal control by reinforcement synthesis. Automatica, 47, 1047–1052.
- Lin, W., & Zheng, C. (2012). Constrained adaptive optimal control using a reinforcement learning agent. Automatica, 48, 2614–2619.
- Liu, D., Li, H., & Wang, D. (2013). Neural-network-based zero-sum game for discrete-time nonlinear systems via iterative adaptive dynamic programming algorithm. Neurocomputing, 110, 92–100.
- Liu, D., Yang, X., & Li, H. (2012). Adaptive optimal control for a class of continuous-time affine nonlinear systems with unknown internal dynamics. Neural Computing and Applications, doi: 10.1007/s00521-012-1249-y.
- Liu, D., Wang, D., & Yang, X. (2013). An iterative adaptive dynamic programming algorithm for optimal control of unknown discrete-time nonlinear systems with constrained inputs. Information Sciences, 220, 331–342.
- Liu, D., & Wei, Q. (2013). Finite-approximation-error-based optimal control approach for discrete-time nonlinear systems. IEEE Transactions on Cybernetics, 43, 779–789.
- Lyshevski, S.E. (1998). Optimal control of nonlinear continuous-time systems: Design of bounded controllers via generalized nonquadratic functionals. In Proceedings of the American Control Conference (pp. 205–209). Philadelphia, PA.
- Modares, H., Lewis, F.L., & Sistani, M. (2012). Online solution of nonquadratic two-player zero-sum games arising in the H∞ control of constrained input systems. International Journal of Adaptive Control and Signal Processing, doi: 10.1002/acs.2348.
- Murray, J.J., Cox, C.J., Lendaris, G.G., & Saeks, R. (2002). Adaptive dynamic programming. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 32, 140–153.
- Padhi, R., Unnikrishnan, N., Wang, X., & Balakrishnan, S.N. (2006). A single network adaptive critic (SNAC) architecture for optimal control synthesis for a class of nonlinear systems. Neural Networks, 19, 1648–1660.
- Powell, W.B. (2011). Approximate dynamic programming: Solving the curses of dimensionality (2nd ed.). Hoboken, NJ: Wiley.
- Prokhorov, D.V., & Wunsch, D.C. (1997). Adaptive critic designs. IEEE Transactions on Neural Networks, 8, 997–1007.
- Sahnoun, M., Andrieu, V., & Nadri, M. (2012). Nonlinear and locally optimal controllers design for input affine locally controllable systems. International Journal of Control, 85, 159–170.
- Si, J., & Wang, Y.T. (2001). On-line learning control by association and reinforcement. IEEE Transactions on Neural Networks, 12, 264–276.
- Sutton, R.S., & Barto, A.G. (1998). Reinforcement learning: An introduction. Massachusetts: MIT Press.
- Vamvoudakis, K.G., & Lewis, F.L. (2010). Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica, 46, 878–888.
- Wang, F.Y., Jin, N., Liu, D., & Wei, Q. (2011). Adaptive dynamic programming for finite horizon optimal control of discrete-time nonlinear systems with ε-error bound. IEEE Transactions on Neural Networks, 22, 24–36.
- Wang, D., Liu, D., Wei, Q., Zhao, D., & Jin, N. (2012). Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming. Automatica, 48, 1825–1832.
- Wang, F.Y., Zhang, H., & Liu, D. (2009). Adaptive dynamic programming: An introduction. IEEE Computational Intelligence Magazine, 4, 39–47.
- Wei, Q., & Liu, D. (2012). An iterative ε-optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state. Neural Networks, 32, 236–244.
- Werbos, P.J. (1974). Beyond regression: New tools for prediction and analysis in the behavioral sciences (Ph.D. thesis). Harvard University, Massachusetts.
- Werbos, P.J. (1977). Advanced forecasting methods for global crisis warning and models of intelligence. General Systems Yearbook, 22, 25–38.
- Werbos, P.J. (1992). Approximate dynamic programming for real-time control and neural modeling. In D.A. White & D.A. Sofge (Eds.), Handbook of intelligent control: Neural, fuzzy, and adaptive approaches. New York: Van Nostrand Reinhold.
- Wu, Q.H. (1995). Reinforcement learning control using interconnected learning automata. International Journal of Control, 62, 1–16.
- Xin, M., & Pan, H. (2010). Integrated nonlinear optimal control of spacecraft in proximity operations. International Journal of Control, 83, 347–363.
- Yu, W. (2009). Recent advances in intelligent control systems. London: Springer-Verlag.
- Yu, W., & Li, X. (2001). Some new results on system identification with dynamic neural network. IEEE Transactions on Neural Networks, 12, 412–417.
- Zhang, H., Cui, L., Zhang, X., & Luo, Y. (2011). Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method. IEEE Transactions on Neural Networks, 22, 2226–2236.
- Zhang, H., Liu, D., Luo, Y., & Wang, D. (2013). Adaptive dynamic programming for control: Algorithms and stability. London: Springer.
- Zhang, H., Wei, Q., & Liu, D. (2011). An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games. Automatica, 47, 207–214.