Research Article

Bellman's principle of optimality and deep reinforcement learning for time-varying tasks

Pages 2448-2459 | Received 04 Oct 2020, Accepted 01 Apr 2021, Published online: 16 Apr 2021


Read on this site (1)

Raffaele Iervolino, Massimo Tipaldi & Ali Forootani. (2023) A Lyapunov-based version of the value iteration algorithm formulated as a discrete-time switched affine system. International Journal of Control 96:3, pages 577-592.

Articles from other publishers (2)

Junjun Yang, Kaige Tan, Lei Feng, Ahmed M. El-Sherbeeny & Zhiwu Li. (2023) Reducing the Learning Time of Reinforcement Learning for the Supervisory Control of Discrete Event Systems. IEEE Access 11, pages 59840-59853.
Xun Huang. (2022) Opponent cart-pole dynamics for reinforcement learning of competing agents. Acta Mechanica Sinica 38:5.
