Full Papers

Density estimation based soft actor-critic: deep reinforcement learning for static output feedback control with measurement noise

Ran Wang, Ye Tian & Kenji Kashima
Pages 398-409 | Received 25 Sep 2023, Accepted 16 Jan 2024, Published online: 07 Feb 2024
 

ABSTRACT

State-of-the-art deep reinforcement learning (DRL) methods, including Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), Proximal Policy Optimization (PPO), and Soft Actor-Critic (SAC), demonstrate significant capability in solving the optimal static state feedback control (SSFC) problem, which can be modeled as a fully observed Markov decision process (MDP). However, the optimal static output feedback control (SOFC) problem with measurement noise is a typical partially observable MDP (POMDP), which is difficult to solve, especially over high-dimensional continuous state-action-observation spaces. This paper proposes a two-stage framework to address this challenge. In the laboratory stage, both the states and the noisy outputs are observable; the SOFC policy is converted into a constrained stochastic SSFC policy, whose probability density function is generally not analytical. To this end, a density estimation based SAC algorithm is proposed to explore the optimal SOFC policy by learning the optimal constrained stochastic SSFC policy. Consequently, in the real-world stage, only the noisy outputs and the learned SOFC policy are required to solve the optimal SOFC problem. Numerical simulations and corresponding experiments with robotic arms illustrate the effectiveness of our method. The code is available at https://github.com/RanKyoto/DE-SAC.
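To make the key conversion concrete, here is a minimal sketch (a hypothetical one-dimensional example, not the paper's implementation) of why a deterministic output-feedback policy u = f(y), applied to a noisy output y = cx + v, induces a stochastic state-feedback policy whose density is generally not analytical:

```python
# Hypothetical 1-D illustration: a deterministic output-feedback policy
# u = f(y) with y = c*x + v (Gaussian measurement noise v) behaves, viewed
# from the state x, as a *stochastic* state-feedback policy pi(u | x).
import numpy as np

rng = np.random.default_rng(0)
c, noise_std = 1.0, 0.2

def f(y):
    # hypothetical nonlinear output-feedback policy (a saturated gain)
    return np.tanh(-2.0 * y)

x = 0.5                                               # fix a single state
y = c * x + noise_std * rng.standard_normal(10_000)   # noisy output samples
u = f(y)                                              # induced action samples

# From the state's point of view, u is a random variable: its density is the
# pushforward of N(c*x, noise_std^2) through f, with no simple closed form,
# which is what motivates estimating the density rather than assuming one.
print(f"mean action at x={x}: {u.mean():.3f}, std: {u.std():.3f}")
```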

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

2. dlqe() and dlqr() are the solvers for the linear-quadratic state estimator (LQE) and the linear-quadratic regulator (LQR) for discrete-time linear control systems, respectively. They can be found in MATLAB or the Python Control Systems Library.
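As a usage sketch (with assumed, illustrative system matrices, not those of the paper), the two solvers can be called from the Python Control Systems Library (the `control` package) as follows:

```python
import numpy as np
import control

# Hypothetical discrete-time linear system x[k+1] = A x[k] + B u[k] + G w[k],
# y[k] = C x[k] + v[k], with process noise w and measurement noise v.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
C = np.array([[1.0, 0.0]])
G = np.eye(2)

Q = np.eye(2)           # LQR state cost weight
R = np.array([[1.0]])   # LQR input cost weight
QN = 0.01 * np.eye(2)   # process noise covariance (for the LQE)
RN = np.array([[0.1]])  # measurement noise covariance (for the LQE)

# dlqr returns the optimal state-feedback gain K (u = -K x),
# the Riccati solution S, and the closed-loop eigenvalues E.
K, S, E = control.dlqr(A, B, Q, R)

# dlqe returns the steady-state Kalman estimator gain L, the error
# covariance P, and the estimator eigenvalues.
L, P, E_est = control.dlqe(A, G, C, QN, RN)

print("LQR gain K:\n", K)
print("LQE gain L:\n", L)
```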

Additional information

Funding

This work was supported in part by JSPS KAKENHI under Grant Number JP21H04875.

Notes on contributors

Ran Wang

Ran Wang received the B.E. degree in Automation and the M.E. degree in Control Theory and Control Engineering from Dalian University of Technology, Dalian, China, in 2016 and 2019, respectively. He is currently pursuing the Ph.D. degree with Kyoto University, Kyoto, Japan. His research interests include reinforcement learning and self-triggered control.

Ye Tian

Ye Tian received his B.S. degree in applied mathematics from Ningxia University, Yinchuan, China, in 2014, and the Ph.D. degree in control theory and control engineering from Xidian University, Xi'an, China, in 2021. From 2017 to 2019, he was a visiting student at the Center for Control, Dynamical Systems, and Computation, University of California, Santa Barbara, CA, USA. He is currently a Postdoctoral Fellow with the Graduate School of Informatics, Kyoto University, Kyoto, Japan. His research interests include multi-agent systems, game theory, and social networks.

Kenji Kashima

Kenji Kashima received his Doctoral degree in Informatics from Kyoto University in 2005. He was with Tokyo Institute of Technology, Universität Stuttgart, and Osaka University before joining Kyoto University in 2013, where he is currently an Associate Professor. His research interests include control and learning theory for complex (large-scale, stochastic, networked) dynamical systems, as well as its interdisciplinary applications. He received the Humboldt Research Fellowship (Germany), the IEEE CSS Roberto Tempo Best CDC Paper Award, and the Pioneer Award of the SICE Control Division, among other honors. He is an Associate Editor of the IEEE Transactions on Automatic Control (2017–) and the Asian Journal of Control (2014–), and a member of the IEEE CSS Conference Editorial Board (2011–).
