Original Articles

Nearly data-based optimal control for linear discrete model-free systems with delays via reinforcement learning

Pages 1563-1573 | Received 11 Jan 2013, Accepted 15 Sep 2013, Published online: 29 Jul 2014

References

  • Abu-Khalaf, M., & Lewis, F. (2005). Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica, 41, 779–791.
  • Al-Tamimi, A., Lewis, F., & Abu-Khalaf, M. (2007). Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control. Automatica, 43, 473–481.
  • Basin, M., & Rodriguez-Gonzalez, J. (2006). Optimal control for linear systems with multiple time delays in control input. IEEE Transactions on Automatic Control, 51, 91–97.
  • Bellman, R. (1957). Dynamic programming. Princeton, NJ: Princeton University Press.
  • Bradtke, S., Ydstie, B., & Barto, A. (1994). Adaptive linear quadratic control using policy iteration. In Proceedings of the American Control Conference (Vol. 3, pp. 3475–3479). Baltimore, MD: IEEE.
  • Brewer, J. (1978). Kronecker products and matrix calculus in system theory. IEEE Transactions on Circuits and Systems, 25, 772–781.
  • Cao, N., Zhang, H., Luo, Y., & Feng, D. (2012). Infinite horizon optimal control of affine nonlinear discrete switched systems using two-stage approximate dynamic programming. International Journal of Systems Science, 43, 1673–1682.
  • Cao, Y., & Wu, Q. (1999). Optimization of control parameters in genetic algorithms: A stochastic approach. International Journal of Systems Science, 30, 551–559.
  • Dierks, T., & Jagannathan, S. (2012). Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update. IEEE Transactions on Neural Networks and Learning Systems, 23, 1118–1129.
  • Dorf, R., & Bishop, R. (2011). Modern control systems (12th ed.). New York: Prentice Hall.
  • Furuta, K., Wongsaisuwan, M., & Werner, H. (1993). Dynamic compensator design for discrete-time LQG problem using Markov parameters. In Proceedings of the 32nd IEEE Conference on Decision and Control (Vol. 1, pp. 96–101). San Antonio, TX: IEEE.
  • Kirk, D. (2004). Optimal control theory – An introduction. New York, NY: Dover.
  • Kolmanovskii, V., & Myshkis, A. (1999). Introduction to the theory and applications of functional differential equations. Dordrecht: Kluwer Academic.
  • Koshkouei, A., Farahi, M., & Burnham, K. (2012). An almost optimal control design method for nonlinear time-delay systems. International Journal of Control, 85, 147–158.
  • Lewis, F., & Vamvoudakis, K. (2011). Reinforcement learning for partially observable dynamic processes: Adaptive dynamic programming using measured output data. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 41, 14–25.
  • Lim, R., Phan, M., & Longman, R. (1998). State estimation with ARMarkov models (Technical Report No.3046). Princeton, NJ: Department of Mechanical and Aerospace Engineering, Princeton University.
  • Márquez-Martínez, L., Moog, C., & Velasco-Villa, M. (2002). Observability and observers for nonlinear systems with time delays. Kybernetika, 38, 445–456.
  • Murray, J.J., Cox, C.J., Lendaris, G.G., & Saeks, R. (2002). Adaptive dynamic programming. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 32, 140–153.
  • Niculescu, S. (2001). Delay effects on stability. In Lecture notes in control and information sciences. Berlin: Springer.
  • Prokhorov, D., & Wunsch, D. (1997). Adaptive critic designs. IEEE Transactions on Neural Networks, 8, 997–1007.
  • Richard, J. (2003). Time-delay systems: An overview of some recent advances and open problems. Automatica, 39, 1667–1694.
  • Shi, G., & Skelton, R. (2000). Markov data-based LQG control. Journal of Dynamic Systems, Measurement, and Control, 122, 551–559.
  • Si, J., Barto, A., Powell, W., & Wunsch, D. (2004). Handbook of learning and approximate dynamic programming. New York, NY: Wiley.
  • Song, R., Zhang, H., Luo, Y., & Wei, Q. (2010). Optimal control laws for time-delay systems with saturating actuators based on heuristic dynamic programming. Neurocomputing, 73, 3020–3027.
  • Vamvoudakis, K., & Lewis, F. (2010). Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica, 46, 878–888.
  • Vrabie, D., & Lewis, F. (2009). Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Networks, 22, 237–246.
  • Wang, D., Liu, D., & Wei, Q. (2012). Finite-horizon neuro-optimal tracking control for a class of discrete-time nonlinear systems using adaptive dynamic programming approach. Neurocomputing, 78, 14–22.
  • Wang, F., Zhang, H., & Liu, D. (2009). Adaptive dynamic programming: An introduction. IEEE Computational Intelligence Magazine, 4, 39–47.
  • Wang, Y., Sun, X., & Zhao, J. (2012). Asynchronous H∞ control of switched delay systems with average dwell time. Journal of the Franklin Institute, 349, 3159–3169.
  • Wang, Y., Sun, X., & Zhao, J. (2013). Stabilization of a class of switched stochastic systems with time delays under asynchronous switching. Circuits, Systems, and Signal Processing, 32, 347–360.
  • Wei, Q., & Liu, D. (2012). An iterative ε-optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state. Neural Networks, 32, 236–244.
  • Wei, Q., Wang, D., & Zhang, D. (2012). Dual iterative adaptive dynamic programming for a class of discrete-time nonlinear systems with time-delays. Neural Computing and Applications, 23, 1851–1863.
  • Wei, Q., Zhang, H., Liu, D., & Zhao, Y. (2010). An optimal control scheme for a class of discrete-time nonlinear systems with time delays using adaptive dynamic programming. Acta Automatica Sinica, 36, 121–129.
  • Werbos, P. (1992). Approximate dynamic programming for real-time control and neural modeling. In Handbook of intelligent control: Neural, fuzzy, and adaptive approaches. New York, NY: Van Nostrand Reinhold.
  • Yang, F., Zhang, H., Liu, Z., & Li, R. (2012). Delay-dependent resilient-robust stabilisation of uncertain networked control systems with variable sampling intervals. International Journal of Systems Science. doi:10.1080/00207721.2012.724101.
  • Zhang, H., Cui, L., Zhang, X., & Luo, Y. (2011). Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method. IEEE Transactions on Neural Networks, 22, 2226–2236.
  • Zhang, H., Luo, Y., & Liu, D. (2009). Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints. IEEE Transactions on Neural Networks, 20, 1490–1503.
  • Zhang, H., Shi, Y., & Liu, M. (2013). H∞ step tracking control for networked discrete-time nonlinear systems with integral and predictive actions. IEEE Transactions on Industrial Informatics, 9, 337–345.
  • Zhang, H., Shi, Y., & Mehr, A.S. (2011). Robust static output feedback control and remote PID design for networked motor systems. IEEE Transactions on Industrial Electronics, 58, 5396–5405.
  • Zhang, H., Shi, Y., & Mehr, A.S. (2012). Robust H∞ PID control for multivariable networked control systems with disturbance/noise attenuation. International Journal of Robust and Nonlinear Control, 22, 183–204.
  • Zhang, H., Song, R., Wei, Q., & Zhang, T. (2011). Optimal tracking control for a class of nonlinear discrete-time systems with time delays based on heuristic dynamic programming. IEEE Transactions on Neural Networks, 22, 1851–1862.
  • Zhang, J., Zhang, H., Luo, Y., & Liang, H. (2013). Nearly optimal control scheme using adaptive dynamic programming based on generalized fuzzy hyperbolic model. Acta Automatica Sinica, 39, 142–149.
  • Zhang, J., Zhang, H., Yang, F., & Wang, S. (2013). Robust fault detection filter design for a class of time-delay systems via equivalent transformation. Control Theory and Applications, 11, 54–60.
  • Zhang, X., & Lin, Y. (2013). Adaptive control for a class of nonlinear time-delay systems preceded by unknown hysteresis. International Journal of Systems Science, 44, 1468–1482.
