Research Article

A Novel Augmentative Backward Reward Function with Deep Reinforcement Learning for Autonomous UAV Navigation

Article: 2084473 | Received 22 Feb 2022, Accepted 25 May 2022, Published online: 06 Jul 2022

References

  • Mahmood, A. R., H. van Hasselt, and R. S. Sutton. 2014. Weighted importance sampling for off-policy learning with linear function approximation. Advances in Neural Information Processing Systems 27. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2014/hash/be53ee61104935234b174e62a07e53cf-Abstract.html
  • Bellman, R. 1957. A Markovian decision process. Indiana University Mathematics Journal 6 (4):679–84. doi:10.1512/iumj.1957.6.56038.
  • Erke, S., D. Bin, N. Yiming, Q. Zhu, X. Liang, and Z. Dawei. 2020. An improved A-star based path planning algorithm for autonomous land vehicles. International Journal of Advanced Robotic Systems 17 (5):1729881420962263. doi:10.1177/1729881420962263.
  • Falanga, D., K. Kleber, and D. Scaramuzza. 2020. Dynamic obstacle avoidance for quadrotors with event cameras. Science Robotics 5 (40):eaaz9712. doi:10.1126/scirobotics.aaz9712.
  • Fujimoto, S., H. van Hoof, and D. Meger. 2018. Addressing function approximation error in actor-critic methods. ArXiv.org. https://arxiv.org/abs/1802.09477
  • Hajdu, C., and Á. Ballagi. 2020. Proposal of a graph-based motion planner architecture. 11th IEEE International Conference on Cognitive Infocommunications (CogInfoCom) Online on MaxWhere 3D Web. September 1, 2020. doi:10.1109/CogInfoCom50765.2020.9237891.
  • Hou, Y., and Y. Zhang. 2019. Improving DDPG via prioritized experience replay. https://cardwing.github.io/files/RL_course_report.pdf
  • Kam, H. R., S.-H. Lee, T. Park, and C.-H. Kim. 2015. RViz: A toolkit for real domain data visualization. Telecommunication Systems 60 (2):337–45. doi:10.1007/s11235-015-0034-5.
  • Kapturowski, S., G. Ostrovski, J. Quan, R. Munos, and W. Dabney. 2018. Recurrent experience replay in distributed reinforcement learning. Openreview.net. September 27, 2018. https://openreview.net/forum?id=r1lyTjAqYX
  • Koenig, N., and A. Howard. 2004. Design and use paradigms for Gazebo, an open-source multi-robot simulator. 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No. 04CH37566), Sendai, Japan. Accessed October 20, 2019. doi:10.1109/iros.2004.1389727.
  • Kolar, P., P. Benavidez, and M. Jamshidi. 2020. Survey of datafusion techniques for laser and vision based sensor integration for autonomous navigation. Sensors 20 (8):2180. doi:10.3390/s20082180.
  • Konda, V. R., and J. N. Tsitsiklis. 2003. On actor-critic algorithms. SIAM Journal on Control and Optimization 42 (4):1143–66. doi:10.1137/s0363012901385691.
  • Lillicrap, T. P., J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra. 2015. Continuous control with deep reinforcement learning. ArXiv.org. https://arxiv.org/abs/1509.02971.
  • Mahmud, A. L. 2021. Development of collision resilient drone for flying in cluttered environment. Graduate Theses, Dissertations, and Problem Reports (January). doi:10.33915/etd.8022.
  • Mnih, V., K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller. 2013. Playing Atari with deep reinforcement learning. ArXiv.org. https://arxiv.org/abs/1312.5602.
  • Nitta, K., K. Higuchi, Y. Tadokoro, and J. Rekimoto. 2015. Shepherd pass. Proceedings of the 12th International Conference on Advances in Computer Entertainment Technology, Iskandar, Malaysia, November 2015. doi:10.1145/2832932.2832950.
  • Schaul, T., J. Quan, I. Antonoglou, and D. Silver. 2015. Prioritized experience replay. ArXiv.org. https://arxiv.org/abs/1511.05952.
  • Silver, D., G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller. 2014. Deterministic policy gradient algorithms. Proceedings.mlr.press. PMLR, Beijing, China. January 27, 2014. https://proceedings.mlr.press/v32/silver14.html.
  • Stanford Artificial Intelligence Laboratory et al. 2018. Robot Operating System (version ROS Melodic Morenia). https://www.ros.org.
  • Sutton, R. S., and A. G. Barto. 1999. Reinforcement learning. Journal of Cognitive Neuroscience 11 (1):126–34.
  • Tan, F., P. Yan, and X. Guan. 2017. Deep reinforcement learning: From Q-learning to deep Q-learning. Neural Information Processing, 475–83. doi:10.1007/978-3-319-70093-9_50.
  • Youn, W., H. Ko, H. Choi, I. Choi, J.-H. Baek, and H. Myung. 2020. Collision-free autonomous navigation of a small UAV using low-cost sensors in GPS-denied environments. International Journal of Control, Automation, and Systems 19 (2):953–68. doi:10.1007/s12555-019-0797-7.
  • Zhang, L., S. Han, Z. Zhang, L. Li, and S. Lü. 2020a. Deep recurrent deterministic policy gradient for physical control. Artificial Neural Networks and Machine Learning – ICANN 2020, 257–68. doi:10.1007/978-3-030-61616-8_21.
  • Zhang, Q., M. Zhu, L. Zou, M. Li, and Y. Zhang. 2020b. Learning reward function with matching network for mapless navigation. Sensors 20 (13):3664. doi:10.3390/s20133664.
  • Zijian, H., K. Wan, X. Gao, Y. Zhai, and Q. Wang. 2020. Deep reinforcement learning approach with multiple experience pools for UAV’s autonomous motion planning in complex unknown environments. Sensors 20 (7):1890. doi:10.3390/s20071890.