Abstract
Taxis spend a large amount of time idle, searching for passengers. The routes vacant taxis should follow to minimize their idle time are hard to calculate; they depend on complex effects such as passenger demand, traffic conditions, and inter-taxi competition. Here we explore whether reinforcement learning (RL) can be used for this purpose. Using real-world data from three major cities, we show that RL-taxis can indeed learn to minimize their idle times in different environments. In particular, a single RL-taxi competing with a population of regular taxis learns to outperform its rivals.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1 We have abused notation slightly here by using $\pi(j \mid i)$ to denote the probability of moving to node $j$ when at node $i$. This is abusive since we earlier defined the policy in terms of a state $s$ and action $a$, so by $\pi(j \mid i)$ we mean $\pi(s = i, a = a_{ij})$, where $a_{ij}$ denotes the action that node $j$ is selected when at node $i$.
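The notational shorthand above can be sketched in code. This is a minimal, hypothetical illustration (the node names and function names are ours, not from the paper): a policy stored per node, where `pi[i][j]` plays the role of $\pi(j \mid i)$, i.e. the probability of selecting the action "move to node $j$" when in state $i$.

```python
import random

# Hypothetical policy over graph nodes (names are illustrative, not from the paper).
# pi[i][j] is the probability of moving to node j when at node i,
# i.e. the shorthand pi(j | i) for pi(s = i, a = a_ij).
pi = {
    "A": {"B": 0.7, "C": 0.3},
    "B": {"A": 0.5, "C": 0.5},
    "C": {"A": 1.0},
}

def step(node, rng=random):
    """Sample the next node j ~ pi(. | i): equivalent to taking action a_ij at state s = i."""
    neighbours = list(pi[node])
    weights = [pi[node][j] for j in neighbours]
    return rng.choices(neighbours, weights=weights, k=1)[0]
```

Each row of `pi` is a probability distribution over the neighbouring nodes, so the two notations describe the same object: a state-conditioned distribution over actions, where each action corresponds to one outgoing edge.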