Original Articles

Stability and monotone convergence of generalised policy iteration for discrete-time linear quadratic regulations

Pages 437–450 | Received 06 Apr 2015, Accepted 01 Aug 2015, Published online: 07 Sep 2015

References

  • Al-Tamimi, A., Abu-Khalaf, M., & Lewis, F.L. (2007). Model-free Q-learning designs for discrete-time zero-sum games with application to H∞ control. Automatica, 43(3), 473–481.
  • Al-Tamimi, A., Lewis, F.L., & Abu-Khalaf, M. (2008). Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 38(4), 943–949.
  • Bertsekas, D.P., & Tsitsiklis, J.N. (1996). Neuro-dynamic programming. Belmont, MA: Athena Scientific.
  • Bradtke, S.J., Ydstie, B.E., & Barto, A.G. (1994). Adaptive linear quadratic control using policy iteration. In Proceedings of the American Control Conference (pp. 3475–3479). Baltimore, MD: IEEE.
  • Brewer, J.W. (1978). Kronecker products and matrix calculus in system theory. IEEE Transactions on Circuits and Systems, 25(9), 772–781.
  • Chun, T.Y., Park, J.B., & Choi, Y.H. (2013). Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems. In Proceedings of the 13th International Conference on Control, Automation and Systems (pp. 454–458). Gwangju: IEEE.
  • Doya, K., Kimura, H., & Kawato, M. (2001). Neural mechanisms for learning and control. IEEE Control Systems Magazine, 21(4), 42–54.
  • Enns, R., & Si, J. (2003). Helicopter trimming and tracking control using direct neural dynamic programming. IEEE Transactions on Neural Networks, 14(4), 929–939.
  • Franklin, G.F., Powell, J.D., & Workman, M.L. (1997). Digital control of dynamic systems. Menlo Park, CA: Addison-Wesley.
  • Hanselmann, T., Noakes, L., & Zaknich, A. (2007). Continuous-time adaptive critics. IEEE Transactions on Neural Networks, 18(3), 631–647.
  • Kaelbling, L.P., Littman, M.L., & Moore, A.W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237–285.
  • Khalil, H.K. (2002). Nonlinear systems (3rd ed.). Upper Saddle River, NJ: Prentice Hall.
  • Kleinman, D. (1968). On the iterative technique for Riccati equation computations. IEEE Transactions on Automatic Control, 13(1), 114–115.
  • Lancaster, P., & Rodman, L. (1995). Algebraic Riccati equations. Oxford: Oxford University Press.
  • Landelius, T. (1997). Reinforcement learning and distributed local model synthesis (PhD dissertation). Linköping University, Linköping, Sweden.
  • Lee, J.Y., Park, J.B., & Choi, Y.H. (2010). A novel generalized value iteration scheme for uncertain continuous-time linear systems. In Proceedings of the 49th IEEE Conference on Decision and Control (pp. 4637–4642). Atlanta, GA: IEEE.
  • Lee, J.Y., Park, J.B., & Choi, Y.H. (2012). Approximate dynamic programming for continuous-time linear quadratic regulator problems: Relaxation of known input coupling matrix assumption. IET Control Theory and Applications, 6(13), 2063–2075.
  • Lee, J.Y., Park, J.B., & Choi, Y.H. (2014). On integral generalized policy iteration for continuous-time linear quadratic regulations. Automatica, 50(2), 475–489.
  • Lewis, F.L., & Vrabie, D. (2009). Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits and Systems Magazine, 9(3), 32–50.
  • Powell, W.B. (2007). Approximate dynamic programming: Solving the curses of dimensionality. Hoboken, NJ: John Wiley & Sons.
  • Prokhorov, D.V., & Wunsch, D.C. (1997). Adaptive critic designs. IEEE Transactions on Neural Networks, 8(5), 997–1007.
  • Puterman, M.L., & Shin, M.C. (1978). Modified policy iteration algorithms for discounted Markov decision problems. Management Science, 24(11), 1127–1137.
  • Si, J., Barto, A.G., Powell, W.B., & Wunsch, D. (Eds.). (2004). Handbook of learning and approximate dynamic programming. New York, NY: Wiley-IEEE Press.
  • Si, J., Yang, L., & Liu, D. (2012). Direct neural dynamic programming. In J. Si, A.G. Barto, W.B. Powell, & D. Wunsch (Eds.), Handbook of learning and approximate dynamic programming (pp. 125–151). New York, NY: Wiley-IEEE Press.
  • Strang, G. (2006). Linear algebra and its applications. Belmont, CA: Thomson Higher Education.
  • Sutton, R.S., & Barto, A.G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
  • Vrabie, D. (2010). Online adaptive optimal control for continuous-time systems (PhD thesis). University of Texas at Arlington, TX.
  • Vrabie, D., Vamvoudakis, K., & Lewis, F.L. (2009). Adaptive optimal controllers based on generalized policy iteration in a continuous-time framework. In Proceedings of the 17th Mediterranean Conference on Control and Automation (pp. 1402–1409). Thessaloniki: IEEE.
  • Werbos, P.J. (1992). Approximate dynamic programming for real-time control and neural modeling. In D.A. White & D.A. Sofge (Eds.), Handbook of intelligent control (pp. 493–525). New York, NY: Van Nostrand Reinhold.
  • Wong, W.C., & Lee, J.H. (2008). A reinforcement learning-based scheme for adaptive optimal control of linear stochastic systems. In Proceedings of the American Control Conference (pp. 57–62). Seattle, WA: IEEE.
