References
- Mahmood, A. R., H. van Hasselt, and R. S. Sutton. 2014. Weighted importance sampling for off-policy learning with linear function approximation. Advances in Neural Information Processing Systems 27. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2014/hash/be53ee61104935234b174e62a07e53cf-Abstract.html
- Bellman, R. 1957. A Markovian decision process. Indiana University Mathematics Journal 6 (4):679–84. doi:10.1512/iumj.1957.6.56038.
- Erke, S., D. Bin, N. Yiming, Q. Zhu, X. Liang, and Z. Dawei. 2020. An improved A-star based path planning algorithm for autonomous land vehicles. International Journal of Advanced Robotic Systems 17 (5):172988142096226. doi:10.1177/1729881420962263.
- Falanga, D., K. Kleber, and D. Scaramuzza. 2020. Dynamic obstacle avoidance for quadrotors with event cameras. Science Robotics 5 (40):eaaz9712. doi:10.1126/scirobotics.aaz9712.
- Fujimoto, S., H. van Hoof, and D. Meger. 2018. Addressing function approximation error in actor-critic methods. ArXiv.org. https://arxiv.org/abs/1802.09477.
- Hajdu, C., and Á. Ballagi. 2020. Proposal of a graph-based motion planner architecture. 11th IEEE International Conference on Cognitive Infocommunications (CogInfoCom) Online on MaxWhere 3D Web. September 1, 2020. doi:10.1109/CogInfoCom50765.2020.9237891.
- Hou, Y., and Y. Zhang. 2019. Improving DDPG via prioritized experience replay. https://cardwing.github.io/files/RL_course_report.pdf
- Kam, H. R., S.-H. Lee, T. Park, and C.-H. Kim. 2015. RViz: A toolkit for real domain data visualization. Telecommunication Systems 60 (2):337–45. doi:10.1007/s11235-015-0034-5.
- Kapturowski, S., G. Ostrovski, J. Quan, R. Munos, and W. Dabney. 2018. Recurrent experience replay in distributed reinforcement learning. OpenReview.net. September 27, 2018. https://openreview.net/forum?id=r1lyTjAqYX
- Koenig, N., and A. Howard. 2004. Design and use paradigms for Gazebo, an open-source multi-robot simulator. 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566), Sendai, Japan. Accessed October 20, 2019. doi:10.1109/iros.2004.1389727.
- Kolar, P., P. Benavidez, and M. Jamshidi. 2020. Survey of data fusion techniques for laser and vision based sensor integration for autonomous navigation. Sensors 20 (8):2180. doi:10.3390/s20082180.
- Konda, V. R., and J. N. Tsitsiklis. 2003. On actor-critic algorithms. SIAM Journal on Control and Optimization 42 (4):1143–66. doi:10.1137/s0363012901385691.
- Lillicrap, T. P., J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra. 2015. Continuous control with deep reinforcement learning. ArXiv.org. https://arxiv.org/abs/1509.02971.
- Mahmud, A. L. 2021. Development of collision resilient drone for flying in cluttered environment. Graduate Theses, Dissertations, and Problem Reports (January). doi:10.33915/etd.8022.
- Mnih, V., K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller. 2013. Playing Atari with deep reinforcement learning. ArXiv.org. https://arxiv.org/abs/1312.5602.
- Nitta, K., K. Higuchi, Y. Tadokoro, and J. Rekimoto. 2015. Shepherd pass. Proceedings of the 12th International Conference on Advances in Computer Entertainment Technology, Iskandar, Malaysia, November 2015. doi:10.1145/2832932.2832950.
- Schaul, T., J. Quan, I. Antonoglou, and D. Silver. 2015. Prioritized experience replay. ArXiv.org. https://arxiv.org/abs/1511.05952.
- Silver, D., G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller. 2014. Deterministic policy gradient algorithms. Proceedings of the 31st International Conference on Machine Learning, PMLR, Beijing, China. January 27, 2014. https://proceedings.mlr.press/v32/silver14.html.
- Stanford Artificial Intelligence Laboratory et al. 2018. Robotic Operating System (version ROS Melodic Morenia). https://www.ros.org.
- Sutton, R. S., and A. G. Barto. 1999. Reinforcement learning. Journal of Cognitive Neuroscience 11 (1):126–34.
- Tan, F., P. Yan, and X. Guan. 2017. Deep reinforcement learning: From Q-learning to deep Q-learning. Neural Information Processing, 475–83. doi:10.1007/978-3-319-70093-9_50.
- Youn, W., H. Ko, H. Choi, I. Choi, J.-H. Baek, and H. Myung. 2020. Collision-free autonomous navigation of a small UAV using low-cost sensors in GPS-denied environments. International Journal of Control, Automation, and Systems 19 (2):953–68. doi:10.1007/s12555-019-0797-7.
- Zhang, L., S. Han, Z. Zhang, L. Li, and S. Lü. 2020a. Deep recurrent deterministic policy gradient for physical control. Artificial Neural Networks and Machine Learning – ICANN 2020, 257–68. doi:10.1007/978-3-030-61616-8_21.
- Zhang, Q., M. Zhu, L. Zou, M. Li, and Y. Zhang. 2020b. Learning reward function with matching network for mapless navigation. Sensors 20 (13):3664. doi:10.3390/s20133664.
- Hu, Z., K. Wan, X. Gao, Y. Zhai, and Q. Wang. 2020. Deep reinforcement learning approach with multiple experience pools for UAV’s autonomous motion planning in complex unknown environments. Sensors 20 (7):1890. doi:10.3390/s20071890.