Original Articles

Stability and monotone convergence of generalised policy iteration for discrete-time linear quadratic regulations

Pages 437–450 | Received 06 Apr 2015, Accepted 01 Aug 2015, Published online: 07 Sep 2015

References

  • Al-Tamimi, A., Abu-Khalaf, M., & Lewis, F.L. (2007). Model-free Q-learning designs for discrete-time zero-sum games with application to H∞ control. Automatica, 43(3), 473–481.
  • Al-Tamimi, A., Lewis, F.L., & Abu-Khalaf, M. (2008). Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 38(4), 943–949.
  • Bertsekas, D.P., & Tsitsiklis, J.N. (1996). Neuro-dynamic programming. Belmont, MA: Athena Scientific.
  • Bradtke, S.J., Ydstie, B.E., & Barto, A.G. (1994). Adaptive linear quadratic control using policy iteration. In Proceedings of the American Control Conference (pp. 3475–3479). Baltimore, MD: IEEE.
  • Brewer, J.W. (1978). Kronecker products and matrix calculus in system theory. IEEE Transactions on Circuits and Systems, 25(9), 772–781.
  • Chun, T.Y., Park, J.B., & Choi, Y.H. (2013). Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems. In Proceedings of the 13th International Conference on Control, Automation and Systems (pp. 454–458). Gwangju: IEEE.
  • Doya, K., Kimura, H., & Kawato, M. (2001). Neural mechanisms for learning and control. IEEE Control Systems Magazine, 21(4), 42–54.
  • Enns, R., & Si, J. (2003). Helicopter trimming and tracking control using direct neural dynamic programming. IEEE Transactions on Neural Networks, 14(4), 929–939.
  • Franklin, G.F., Powell, J.D., & Workman, M.L. (1997). Digital control of dynamic systems. Menlo Park, CA: Addison-Wesley.
  • Hanselmann, T., Noakes, L., & Zaknich, A. (2007). Continuous-time adaptive critics. IEEE Transactions on Neural Networks, 18(3), 631–647.
  • Kaelbling, L.P., Littman, M.L., & Moore, A.W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237–285.
  • Khalil, H.K. (2002). Nonlinear systems (3rd ed.). Upper Saddle River, NJ: Prentice Hall.
  • Kleinman, D. (1968). On the iterative technique for Riccati equation computations. IEEE Transactions on Automatic Control, 13(1), 114–115.
  • Lancaster, P., & Rodman, L. (1995). Algebraic Riccati equations. Oxford: Oxford University Press.
  • Landelius, T. (1997). Reinforcement learning and distributed local model synthesis (PhD dissertation). Linköping University, Linköping, Sweden.
  • Lee, J.Y., Park, J.B., & Choi, Y.H. (2010). A novel generalized value iteration scheme for uncertain continuous-time linear systems. In Proceedings of the 49th IEEE Conference on Decision and Control (pp. 4637–4642). Atlanta, GA: IEEE.
  • Lee, J.Y., Park, J.B., & Choi, Y.H. (2012). Approximate dynamic programming for continuous-time linear quadratic regulator problems: Relaxation of known input coupling matrix assumption. IET Control Theory and Applications, 6(13), 2063–2075.
  • Lee, J.Y., Park, J.B., & Choi, Y.H. (2014). On integral generalized policy iteration for continuous-time linear quadratic regulations. Automatica, 50(2), 475–489.
  • Lewis, F.L., & Vrabie, D. (2009). Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits and Systems Magazine, 9(3), 32–50.
  • Powell, W.B. (2007). Approximate dynamic programming: Solving the curses of dimensionality. Hoboken, NJ: John Wiley & Sons.
  • Prokhorov, D.V., & Wunsch, D.C. (1997). Adaptive critic designs. IEEE Transactions on Neural Networks, 8(5), 997–1007.
  • Puterman, M.L., & Shin, M.C. (1978). Modified policy iteration algorithms for discounted Markov decision problems. Management Science, 24(11), 1127–1137.
  • Si, J., Barto, A.G., Powell, W.B., & Wunsch, D. (Eds.). (2004). Handbook of learning and approximate dynamic programming. New York, NY: Wiley-IEEE Press.
  • Si, J., Yang, L., & Liu, D. (2012). Direct neural dynamic programming. In J. Si, A.G. Barto, W.B. Powell, & D. Wunsch (Eds.), Handbook of learning and approximate dynamic programming (pp. 125–151). New York, NY: Wiley-IEEE Press.
  • Strang, G. (2006). Linear algebra and its applications. Belmont, CA: Thomson Higher Education.
  • Sutton, R.S., & Barto, A.G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
  • Vrabie, D. (2010). Online adaptive optimal control for continuous-time systems (PhD thesis). University of Texas at Arlington, TX.
  • Vrabie, D., Vamvoudakis, K., & Lewis, F.L. (2009). Adaptive optimal controllers based on generalized policy iteration in a continuous-time framework. In Proceedings of the 17th Mediterranean Conference on Control and Automation (pp. 1402–1409). Thessaloniki: IEEE.
  • Werbos, P.J. (1992). Approximate dynamic programming for real-time control and neural modeling. In D.A. White & D.A. Sofge (Eds.), Handbook of intelligent control (pp. 493–525). New York, NY: Van Nostrand Reinhold.
  • Wong, W.C., & Lee, J.H. (2008). A reinforcement learning-based scheme for adaptive optimal control of linear stochastic systems. In Proceedings of the American Control Conference (pp. 57–62). Seattle, WA: IEEE.
