Learning adversarial policy in multiple scenes environment via multi-agent reinforcement learning

Pages 407-426 | Received 06 Aug 2020, Accepted 20 Sep 2020, Published online: 03 Nov 2020

References

  • Bacchiani, G., Molinari, D., & Patander, M. (2019). Microscopic traffic simulation by cooperative multi-agent deep reinforcement learning. In Proceedings of the 18th international conference on autonomous agents and multiagent systems (pp. 1547–1555).
  • Barakova, E. I., De Haas, M., Kuijpers, W., Irigoyen, N., & Betancourt, A. (2018). Socially grounded game strategy enhances bonding and perceived smartness of a humanoid robot. Connection Science, 30(1), 81–98. https://doi.org/10.1080/09540091.2017.1350938
  • Barrett, E., Duggan, J., & Howley, E. (2014). A parallel framework for Bayesian reinforcement learning. Connection Science, 26(1), 7–23. https://doi.org/10.1080/09540091.2014.885268
  • Billings, D., Papp, D., Schaeffer, J., & Szafron, D. (1998). Opponent modeling in poker. In AAAI/IAAI (pp. 493–499).
  • Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., & Zaremba, W. (2016). OpenAI Gym. arXiv preprint arXiv:1606.01540.
  • Brys, T., Pham, T. T., & Taylor, M. E. (2014). Distributed learning and multi-objectivity in traffic light control. Connection Science, 26(1), 65–83. https://doi.org/10.1080/09540091.2014.885282
  • Dineva, E., & Schöner, G. (2018). How infants' reaches reveal principles of sensorimotor decision making. Connection Science, 30(1), 53–80. https://doi.org/10.1080/09540091.2017.1405382
  • Foerster, J. N., Farquhar, G., Afouras, T., Nardelli, N., & Whiteson, S. (2018). Counterfactual multi-agent policy gradients. In Thirty-second AAAI conference on artificial intelligence.
  • Foerster, J., Nardelli, N., Farquhar, G., Afouras, T., Torr, P. H., Kohli, P., & Whiteson, S. (2017). Stabilising experience replay for deep multi-agent reinforcement learning. In Proceedings of the 34th international conference on machine learning (Vol. 70, pp. 1146–1155). JMLR.org.
  • Gupta, J. K., Egorov, M., & Kochenderfer, M. (2017). Cooperative multi-agent control using deep reinforcement learning. In International conference on autonomous agents and multiagent systems (pp. 66–83). Springer.
  • Hernandez-Leal, P., Kaisers, M., Baarslag, T., & de Cote, E. M. (2017). A survey of learning in multiagent environments: Dealing with non-stationarity. arXiv preprint arXiv:1707.09183.
  • Hernandez-Leal, P., Munoz de Cote, E., & Sucar, L. E. (2014). A framework for learning and planning against switching strategies in repeated games. Connection Science, 26(2), 103–122. https://doi.org/10.1080/09540091.2014.885294
  • Hu, J., & Wellman, M. P. (1998). Multiagent reinforcement learning: Theoretical framework and an algorithm. In ICML (Vol. 98, pp. 242–250). Citeseer.
  • Jaderberg, M., Czarnecki, W. M., Dunning, I., Marris, L., Lever, G., & Castaneda, A. G. (2019). Human-level performance in 3D multiplayer games with population-based reinforcement learning. Science, 364(6443), 859–865. https://doi.org/10.1126/science.aau6249
  • Juliani, A., Berges, V. P., Vckay, E., Gao, Y., Henry, H., Mattar, M., & Lange, D. (2018). Unity: A general platform for intelligent agents. arXiv preprint arXiv:1809.02627.
  • Kheyrinataj, F., & Nazemi, A. (2020). Fractional power series neural network for solving delay fractional optimal control problems. Connection Science, 32(1), 53–80. https://doi.org/10.1080/09540091.2019.1605498
  • Lauer, M., & Riedmiller, M. (2000). An algorithm for distributed reinforcement learning in cooperative multi-agent systems. In Proceedings of the seventeenth international conference on machine learning. Citeseer.
  • Littman, M. L. (1994). Markov games as a framework for multi-agent reinforcement learning. In Machine learning proceedings 1994 (pp. 157–163). Elsevier.
  • Liu, Y., Wang, W., Hu, Y., Hao, J., Chen, X., & Gao, Y. (2020). Multi-agent game abstraction via graph attention neural network. In AAAI (pp. 7211–7218).
  • Long, P., Fan, T., Liao, X., Liu, W., Zhang, H., & Pan, J. (2018). Towards optimally decentralized multi-robot collision avoidance via deep reinforcement learning. In 2018 IEEE international conference on robotics and automation (ICRA) (pp. 6252–6259).
  • Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., & Mordatch, I. (2017). Multi-agent actor-critic for mixed cooperative-competitive environments. In Advances in neural information processing systems (pp. 6379–6390).
  • Mao, H., Zhang, Z., Xiao, Z., & Gong, Z. (2019). Modelling the dynamic joint policy of teammates with attention multi-agent DDPG. In Proceedings of the 18th international conference on autonomous agents and multiagent systems (pp. 1108–1116). International Foundation for Autonomous Agents and Multiagent Systems.
  • Matignon, L., Jeanpierre, L., & Mouaddib, A. I. (2012). Coordinated multi-robot exploration under communication constraints using decentralized Markov decision processes. In Twenty-sixth AAAI conference on artificial intelligence.
  • Matignon, L., Laurent, G. J., & Le Fort-Piat, N. (2007). Hysteretic Q-learning: An algorithm for decentralized reinforcement learning in cooperative multi-agent teams. In 2007 IEEE/RSJ international conference on intelligent robots and systems (pp. 64–69). IEEE.
  • Matignon, L., Laurent, G. J., & Le Fort-Piat, N. (2012). Independent reinforcement learners in cooperative Markov games: A survey regarding coordination problems. The Knowledge Engineering Review, 27(1), 1–31. https://doi.org/10.1017/S0269888912000057
  • Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., & Kavukcuoglu, K. (2016). Asynchronous methods for deep reinforcement learning. In International conference on machine learning (pp. 1928–1937).
  • Movric, K. H., & Lewis, F. L. (2013). Cooperative optimal control for multi-agent systems on directed graph topologies. IEEE Transactions on Automatic Control, 59(3), 769–774. https://doi.org/10.1109/TAC.2013.2275670
  • Nash, J. F. (1950). Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, 36(1), 48–49. https://doi.org/10.1073/pnas.36.1.48
  • Nazemi, A., Fayyazi, E., & Mortezaee, M. (2019). Solving optimal control problems of the time-delayed systems by a neural network framework. Connection Science, 31(4), 342–372. https://doi.org/10.1080/09540091.2019.1604627
  • Omidshafiei, S., Pazis, J., Amato, C., How, J. P., & Vian, J. (2017). Deep decentralized multi-task multi-agent reinforcement learning under partial observability. In Proceedings of the 34th international conference on machine learning (Vol. 70, pp. 2681–2690). JMLR.org.
  • Peng, P., Yuan, Q., Wen, Y., Yang, Y., Tang, Z., Long, H., & Wang, J. (2017). Multiagent bidirectionally-coordinated nets for learning to play StarCraft combat games. arXiv preprint arXiv:1703.10069.
  • Raboin, E., Švec, P., Nau, D. S., & Gupta, S. K. (2015). Model-predictive asset guarding by team of autonomous surface vehicles in environment with civilian boats. Autonomous Robots, 38(3), 261–282. https://doi.org/10.1007/s10514-014-9409-9
  • Ryu, H., Shin, H., & Park, J. (2020). Multi-agent actor-critic with hierarchical graph attention network. In AAAI (pp. 7236–7243).
  • Sartoretti, G., Paivine, W., Shi, Y., Wu, Y., & Choset, H. (2019). Distributed learning of decentralized control policies for articulated mobile robots. IEEE Transactions on Robotics, 35(5), 1109–1122. https://doi.org/10.1109/TRO.8860
  • Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.
  • Semnani, S. H., Liu, H., Everett, M., de Ruiter, A., & How, J. P. (2020). Multi-agent motion planning for dense and dynamic environments via deep reinforcement learning. IEEE Robotics and Automation Letters, 5(2), 3221–3226. https://doi.org/10.1109/LSP.2016.
  • Singh, S., Kearns, M., & Mansour, Y. (2000). Nash convergence of gradient dynamics in general-sum games. In Proceedings of the sixteenth conference on uncertainty in artificial intelligence (pp. 541–548). Morgan Kaufmann Publishers Inc.
  • Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.
  • Tao, N., Baxter, J., & Weaver, L. (2001). A multi-agent, policy-gradient approach to network routing. In Proceedings of the 18th international conference on machine learning. Citeseer.
  • Tesauro, G. (2004). Extending Q-learning to general adaptive multi-agent systems. In Advances in neural information processing systems (pp. 871–878).
  • Wang, D., Fan, T., Han, T., & Pan, J. (2020). A two-stage reinforcement learning approach for multi-UAV collision avoidance under imperfect sensing. IEEE Robotics and Automation Letters, 5(2), 3098–3105. https://doi.org/10.1109/LSP.2016.
  • Watkins, C. J., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3–4), 279–292. https://doi.org/10.1023/A:1022676722315
  • Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3–4), 229–256. https://doi.org/10.1007/BF00992696
  • Yu, C., Wang, X., Hao, J., & Feng, Z. (2019). Reinforcement learning for cooperative overtaking. In Proceedings of the 18th international conference on autonomous agents and multiagent systems (pp. 341–349).
  • Zhang, C., & Lesser, V. (2010). Multi-agent learning with policy prediction. In Twenty-fourth AAAI conference on artificial intelligence.
  • Zhang, Z., Luo, X., Liu, T., Xie, S., Wang, J., Wang, W., & Peng, Y. (2019). Proximal policy optimization with mixed distributed training. In 2019 IEEE 31st international conference on tools with artificial intelligence (ICTAI) (pp. 1452–1456).
