References
- Matsuo Y, LeCun Y, Sahani M, et al. Deep learning, reinforcement learning, and world models. Neural Netw. 2022;152:267–275. doi: 10.1016/j.neunet.2022.03.037
- Sutton RS, Barto AG. Reinforcement learning: an introduction. Cambridge (MA): MIT Press; 2018.
- Lillicrap TP, Hunt JJ, Pritzel A, et al. Continuous control with deep reinforcement learning. Preprint, arXiv:1509.02971. 2015.
- Fujimoto S, van Hoof H, Meger D. Addressing function approximation error in actor-critic methods. International Conference on Machine Learning; PMLR; 2018; p. 1582–1591.
- Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms. Preprint, arXiv:1707.06347. 2017.
- Haarnoja T, Zhou A, Abbeel P, et al. Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. International Conference on Machine Learning; PMLR; 2018; p. 1861–1870.
- Liu Q, Chung A, Szepesvári C, et al. When is partially observable reinforcement learning not scary? Conference on Learning Theory; PMLR; 2022; p. 5175–5220.
- Egorov M, Sunberg ZN, Balaban E, et al. POMDPs.jl: a framework for sequential decision making under uncertainty. J Mach Learn Res. 2017;18(26):1–5. Available from: http://jmlr.org/papers/v18/16-300.html.
- Sunberg Z, Kochenderfer M. Online algorithms for POMDPs with continuous state, action, and observation spaces. Proceedings of the International Conference on Automated Planning and Scheduling; 2018; Vol. 28. p. 259–263.
- Takakura S, Sato K. Structured output feedback control for linear quadratic regulator using policy gradient method. IEEE Trans Autom Control. 2023.
- Neto HC, Trindade MA. Control of drill string torsional vibrations using optimal static output feedback. Control Eng Pract. 2023;130:105366. doi: 10.1016/j.conengprac.2022.105366
- Chen C, Xie L, Xie K, et al. Adaptive optimal output tracking of continuous-time systems via output-feedback-based reinforcement learning. Automatica. 2022;146:110581. doi: 10.1016/j.automatica.2022.110581
- Fatkhullin I, Polyak B. Optimizing static linear feedback: Gradient method. SIAM J Control Optim. 2021;59(5):3887–3911. doi: 10.1137/20M1329858
- Veselý V. Static output feedback controller design. Kybernetika. 2001;37(2):205–221.
- Zhang H, Chen H, Xiao C, et al. Robust deep reinforcement learning against adversarial perturbations on state observations. Adv Neural Inf Process Syst. 2020;33:21024–21037.
- Vlassis N, Littman ML, Barber D. On the computational complexity of stochastic controller optimization in POMDPs. ACM Trans Comput Theory (TOCT). 2012;4(4):1–8. doi: 10.1145/2382559.2382563
- Pishro-Nik H. Introduction to probability, statistics and random processes. Cambridge (MA): Kappa Research, LLC; 2014.
- Wang Z, Scott DW. Nonparametric density estimation for high-dimensional data–algorithms and applications. Wiley Interdiscip Rev Comput Stat. 2019;11(4):e1461. doi: 10.1002/wics.1461
- Chen YC. A tutorial on kernel density estimation and recent advances. Biostat Epidemiol. 2017;1(1):161–187. doi: 10.1080/24709360.2017.1396742
- Papamakarios G, Pavlakou T, Murray I. Masked autoregressive flow for density estimation. Adv Neural Inf Process Syst. 2017;30:2338–2347.
- Germain M, Gregor K, Murray I, et al. MADE: masked autoencoder for distribution estimation. International Conference on Machine Learning; PMLR; 2015; p. 881–889.
- Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift. International Conference on Machine Learning; PMLR; 2015; p. 448–456.
- Sutton RS, McAllester D, Singh S, et al. Policy gradient methods for reinforcement learning with function approximation. In: Solla S, Leen T, Müller K, editors. Advances in Neural Information Processing Systems; Vol. 12. Cambridge (MA): MIT Press; 1999.
- Silver D, Lever G, Heess N, et al. Deterministic policy gradient algorithms. International Conference on Machine Learning; PMLR; 2014; p. 387–395.
- Haarnoja T, Zhou A, Hartikainen K, et al. Soft actor-critic algorithms and applications. Preprint, arXiv:1812.05905. 2018.
- Gu S, Holly E, Lillicrap TP, et al. Deep reinforcement learning for robotic manipulation. Preprint, arXiv:1610.00633. 2016.
- Rusu AA, Večerík M, Rothörl T, et al. Sim-to-real robot learning from pixels with progressive nets. Conference on Robot Learning; PMLR; 2017; p. 262–270.
- Andrychowicz OM, Baker B, Chociej M, et al. Learning dexterous in-hand manipulation. Int J Rob Res. 2020;39(1):3–20. doi: 10.1177/0278364919887447
- Towers M, Terry JK, Kwiatkowski A, et al. Gymnasium. Zenodo; 2023. doi: 10.5281/zenodo.8127026
- Coumans E, Bai Y. PyBullet, a Python module for physics simulation for games, robotics and machine learning; 2016–2021. Available from: http://pybullet.org.
- Zhong J, Gupta A, Power T. UM-ARM-Lab/pytorch_kinematics: v0.5.4; 2023.
- Raffin A, Hill A, Gleave A, et al. Stable-Baselines3: reliable reinforcement learning implementations. J Mach Learn Res. 2021;22(268):1–8. Available from: http://jmlr.org/papers/v22/20-1364.html.
- Raffin A, Kober J, Stulp F. Smooth exploration for robotic reinforcement learning. Conference on Robot Learning; PMLR; 2022; p. 1634–1644.