Research Articles

Multi-agent reinforcement learning to unify order-matching and vehicle-repositioning in ride-hailing services

Pages 380–402 | Received 21 Feb 2022, Accepted 27 Aug 2022, Published online: 07 Sep 2022

References

  • Arulkumaran, K., et al., 2017. Deep reinforcement learning: a brief survey. IEEE Signal Processing Magazine, 34 (6), 26–38.
  • Bellman, R., 1957. A Markovian decision process. Indiana University Mathematics Journal, 6 (4), 679–684.
  • Foerster, J., et al., 2016. Learning to communicate with deep multi-agent reinforcement learning. In: Advances in neural information processing systems. La Jolla, CA: Curran Associates, Inc, vol. 29.
  • Foerster, J., et al., 2017. Stabilising experience replay for deep multi-agent reinforcement learning. In: Proceedings of the 34th international conference on machine learning. PMLR, 1146–1155.
  • Foerster, J., et al., 2018. Counterfactual multi-agent policy gradients. Proceedings of the AAAI Conference on Artificial Intelligence, 32 (1), 2974–2982.
  • Gao, Y., Jiang, D., and Xu, Y., 2018. Optimize taxi driving strategies based on reinforcement learning. International Journal of Geographical Information Science, 32 (8), 1677–1696.
  • Gupta, J.K., Egorov, M., and Kochenderfer, M., 2017. Cooperative multi-agent control using deep reinforcement learning. In: International conference on autonomous agents and multiagent systems. Cham: Springer International Publishing, 66–83.
  • Hernandez-Leal, P., Kartal, B., and Taylor, M.E., 2018. Is multiagent deep reinforcement learning the answer or the question? A brief survey. Learning, 21, 22.
  • Holler, J., et al., 2019. Deep reinforcement learning for multi-driver vehicle dispatching and repositioning problem. In: 2019 IEEE international conference on data mining. New York: IEEE, 1090–1095.
  • Jiang, J., and Lu, Z., 2018. Learning attentional communication for multi-agent cooperation. In: S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi and R. Garnett, eds. Advances in neural information processing systems. La Jolla, CA: Curran Associates, Inc, vol. 31.
  • Kingma, D.P., and Ba, J., 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
  • Lillicrap, T.P., et al., 2015. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971.
  • Li, H., and Shen, I.F., 2006. Similarity measure for vector field learning. In: International symposium on neural networks. Cham: Springer, 436–441.
  • Littman, M.L., 1994. Markov games as a framework for multi-agent reinforcement learning. In: Machine learning proceedings 1994. San Francisco, CA: Morgan Kaufmann, 157–163.
  • Liu, C., Chen, C.X., and Chen, C., 2021. META: A city-wide taxi repositioning framework based on multi-agent reinforcement learning. IEEE Transactions on Intelligent Transportation Systems, 23, 13890–13895.
  • Liu, Z., Li, J., and Wu, K., 2020. Context-aware taxi dispatching at city-scale using deep reinforcement learning. IEEE Transactions on Intelligent Transportation Systems, 23, 1996–2009.
  • Li, M., et al., 2019. Efficient ridesharing order dispatching with mean field multi-agent reinforcement learning. In: The world wide web conference. New York: Association for Computing Machinery, 983–994.
  • Lin, K., et al., 2018. Efficient large-scale fleet management via multi-agent deep reinforcement learning. In: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. New York: Association for Computing Machinery, 1774–1783.
  • Lowe, R., et al., 2017. Multi-agent actor-critic for mixed cooperative-competitive environments. In: I. Guyon, U.V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, eds. Advances in neural information processing systems. La Jolla, CA: Curran Associates, Inc, vol. 30.
  • Luong, M.T., Pham, H., and Manning, C.D., 2015. Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025.
  • Mao, C., Liu, Y., and Shen, Z.J.M., 2020. Dispatch of autonomous vehicles for taxi services: a deep reinforcement learning approach. Transportation Research Part C: Emerging Technologies, 115, 102626.
  • Mnih, V., et al., 2015. Human-level control through deep reinforcement learning. Nature, 518 (7540), 529–533.
  • Munkres, J., 1957. Algorithms for the assignment and transportation problems. Journal of the Society for Industrial and Applied Mathematics, 5 (1), 32–38.
  • Qin, G., et al., 2021a. Optimizing matching time intervals for ride-hailing services using reinforcement learning. Transportation Research Part C: Emerging Technologies, 129, 103239.
  • Qin, Z.T., Zhu, H., and Ye, J., 2021b. Reinforcement learning for ridesharing: A survey. In: 2021 IEEE international intelligent transportation systems conference (ITSC). New York: IEEE, 2447–2454.
  • Roughgarden, T., 2005. Selfish routing and the price of anarchy. Cambridge, MA: MIT Press.
  • Schulman, J., et al., 2017. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.
  • Sukhbaatar, S., Fergus, R., et al., 2016. Learning multiagent communication with backpropagation. Advances in Neural Information Processing Systems, 29, 2244–2252.
  • Sutton, R.S., Precup, D., and Singh, S., 1999. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112 (1–2), 181–211.
  • Tampuu, A., et al., 2017. Multiagent cooperation and competition with deep reinforcement learning. PLoS One, 12 (4), e0172395.
  • Tang, X., et al., 2019. A deep value-network based approach for multi-driver order dispatching. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining. New York: Association for Computing Machinery, 1780–1790.
  • Wang, Z., et al., 2016. Sample efficient actor-critic with experience replay. arXiv preprint arXiv:1611.01224.
  • Wang, Z., et al., 2018. Deep reinforcement learning with knowledge transfer for online rides order dispatching. In: 2018 IEEE international conference on data mining. New York: IEEE, 617–626.
  • Xu, Z., et al., 2018. Large-scale order dispatch in on-demand ride-hailing platforms: a learning and planning approach. In: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. New York: Association for Computing Machinery, 905–913.
  • Yu, C., et al., 2021. The surprising effectiveness of PPO in cooperative, multi-agent games. arXiv preprint arXiv:2103.01955.
  • Zhou, M., et al., 2019. Multi-agent reinforcement learning for order-dispatching via order-vehicle distribution matching. In: Proceedings of the 28th ACM international conference on information and knowledge management. New York: Association for Computing Machinery, 2645–2653.
