References
- Al-Tamimi, A., Abu-Khalaf, M., & Lewis, F.L. (2007). Model-free Q-learning designs for discrete-time zero-sum games with application to H∞ control. Automatica, 43(3), 473–481.
- Al-Tamimi, A., Lewis, F.L., & Abu-Khalaf, M. (2008). Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 38(4), 943–949.
- Bertsekas, D.P., & Tsitsiklis, J.N. (1996). Neuro-dynamic programming. Belmont, MA: Athena Scientific.
- Bradtke, S.J., Ydstie, B.E., & Barto, A.G. (1994). Adaptive linear quadratic control using policy iteration. In Proceedings of the American Control Conference (pp. 3475–3479). Baltimore, MD: IEEE.
- Brewer, J.W. (1978). Kronecker products and matrix calculus in system theory. IEEE Transactions on Circuits and Systems, 25(9), 772–781.
- Chun, T.Y., Park, J.B., & Choi, Y.H. (2013). Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems. In Proceedings of the 13th International Conference on Control, Automation and Systems (pp. 454–458). Gwangju: IEEE.
- Doya, K., Kimura, H., & Kawato, M. (2001). Neural mechanisms for learning and control. IEEE Control Systems Magazine, 21(4), 42–54.
- Enns, R., & Si, J. (2003). Helicopter trimming and tracking control using direct neural dynamic programming. IEEE Transactions on Neural Networks, 14(4), 929–939.
- Franklin, G.F., Powell, J.D., & Workman, M.L. (1997). Digital control of dynamic systems. Menlo Park, CA: Addison-Wesley.
- Hanselmann, T., Noakes, L., & Zaknich, A. (2007). Continuous-time adaptive critics. IEEE Transactions on Neural Networks, 18(3), 631–647.
- Kaelbling, L.P., Littman, M.L., & Moore, A.W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237–285.
- Khalil, H.K. (2002). Nonlinear systems (3rd ed.). Upper Saddle River, NJ: Prentice Hall.
- Kleinman, D. (1968). On an iterative technique for Riccati equation computations. IEEE Transactions on Automatic Control, 13(1), 114–115.
- Lancaster, P., & Rodman, L. (1995). Algebraic Riccati equations. Oxford: Oxford University Press.
- Landelius, T. (1997). Reinforcement learning and distributed local model synthesis (PhD dissertation). Linköping University, Sweden.
- Lee, J.Y., Park, J.B., & Choi, Y.H. (2010). A novel generalized value iteration scheme for uncertain continuous-time linear systems. In Proceedings of the 49th IEEE Conference on Decision and Control (pp. 4637–4642). Atlanta, GA: IEEE.
- Lee, J.Y., Park, J.B., & Choi, Y.H. (2012). Approximate dynamic programming for continuous-time linear quadratic regulator problems: Relaxation of known input coupling matrix assumption. IET Control Theory and Applications, 6(13), 2063–2075.
- Lee, J.Y., Park, J.B., & Choi, Y.H. (2014). On integral generalized policy iteration for continuous-time linear quadratic regulations. Automatica, 50(2), 475–489.
- Lewis, F.L., & Vrabie, D. (2009). Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits and Systems Magazine, 9(3), 32–50.
- Powell, W.B. (2007). Approximate dynamic programming: Solving the curses of dimensionality. Hoboken, NJ: John Wiley & Sons.
- Prokhorov, D.V., & Wunsch, D.C. (1997). Adaptive critic designs. IEEE Transactions on Neural Networks, 8(5), 997–1007.
- Puterman, M.L., & Shin, M.C. (1978). Modified policy iteration algorithms for discounted Markov decision problems. Management Science, 24(11), 1127–1137.
- Si, J., Barto, A.G., Powell, W.B., & Wunsch, D. (2004). Handbook of learning and approximate dynamic programming. New York, NY: Wiley-IEEE Press.
- Si, J., Yang, L., & Liu, D. (2012). Direct neural dynamic programming. In J. Si, A.G. Barto, W.B. Powell, & D. Wunsch (Eds.), Handbook of learning and approximate dynamic programming (pp. 125–151). New York, NY: Wiley-IEEE Press.
- Strang, G. (2006). Linear algebra and its applications (4th ed.). Belmont, CA: Thomson Brooks/Cole.
- Sutton, R.S., & Barto, A.G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
- Vrabie, D. (2010). Online adaptive optimal control for continuous-time systems (PhD thesis). University of Texas at Arlington, TX.
- Vrabie, D., Vamvoudakis, K., & Lewis, F.L. (2009). Adaptive optimal controllers based on generalized policy iteration in a continuous-time framework. In Proceedings of the 17th Mediterranean Conference on Control and Automation (pp. 1402–1409). Thessaloniki: IEEE.
- Werbos, P.J. (1992). Approximate dynamic programming for real-time control and neural modeling. In D.A. White & D.A. Sofge (Eds.), Handbook of intelligent control (pp. 493–525). New York, NY: Van Nostrand Reinhold.
- Wong, W.C., & Lee, J.H. (2008). A reinforcement learning-based scheme for adaptive optimal control of linear stochastic systems. In Proceedings of the American Control Conference (pp. 57–62). Seattle, WA: IEEE.