Research Article

A Lyapunov-based version of the value iteration algorithm formulated as a discrete-time switched affine system

Pages 577-592 | Received 17 May 2021, Accepted 05 Nov 2021, Published online: 22 Nov 2021
