Abstract
The airline industry strives to maximize the revenue obtained from the sale of tickets on every flight. This is referred to as revenue management, and it forms a crucial aspect of airline logistics. Ticket pricing, seat or discount allocation, and overbooking are some of the important aspects of a revenue management problem. Though ticket pricing is usually heavily influenced by factors beyond the control of an airline company, a significant amount of control can be exercised over the seat allocation and overbooking aspects. A realistic model for a single leg of a flight should consider multiple fare classes, overbooking of the flight, concurrent demand arrivals of passengers from the different fare classes, and class-dependent, random cancellations. Accommodating all these factors in one optimization model is challenging because doing so yields a very large-scale stochastic optimization problem. Almost all papers in the existing literature either accommodate only a subset of these factors or use a discrete approximation to make the model tractable. We consider all these factors and cast the single-leg problem as a semi-Markov Decision Problem (SMDP) under the average reward optimizing criterion over an infinite time horizon. We solve it using a stochastic optimization technique called Reinforcement Learning. Not only is Reinforcement Learning able to scale up to a huge state space, but because it is simulation-based, it can also handle complex modeling assumptions such as the ones mentioned above. The state space of the numerical test scenarios considered here is non-denumerable, with its countable part being of the order of 10^9. Our solution procedure involves a multi-step extension of the SMART algorithm, which is itself based on the one-step Bellman equation. Numerical results presented here show that our approach is able to outperform a heuristic that is widely used in the airline industry, namely the nested version of the EMSR heuristic. We also present a detailed study of the sensitivity of some modeling parameters via a full factorial experiment.
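To make the one-step idea concrete, the following is a minimal sketch of a SMART-style update for an average-reward SMDP, as that algorithm is commonly described. It is not the multi-step extension used in this work. The function name `smart_update`, the dictionary layout of `rho_state`, and the state and action labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def smart_update(Q, rho_state, state, action, reward, sojourn_time,
                 next_state, actions, alpha, greedy):
    """One SMART-style update for an average-reward SMDP (illustrative sketch).

    Q          : mapping (state, action) -> value estimate
    rho_state  : dict with keys "rho", "total_reward", "total_time" (assumed layout)
    greedy     : True if `action` was the greedy choice in `state`
    """
    # Target: immediate reward, minus the average-reward rate charged for the
    # time spent in the transition, plus the best value from the next state.
    best_next = max(Q[(next_state, b)] for b in actions)
    target = reward - rho_state["rho"] * sojourn_time + best_next
    Q[(state, action)] = (1 - alpha) * Q[(state, action)] + alpha * target

    # The average-reward estimate is refreshed only after greedy actions,
    # as total reward earned divided by total time elapsed.
    if greedy:
        rho_state["total_reward"] += reward
        rho_state["total_time"] += sojourn_time
        if rho_state["total_time"] > 0:
            rho_state["rho"] = rho_state["total_reward"] / rho_state["total_time"]
    return Q[(state, action)]


# Hypothetical usage: a single booking request accepted at fare 100,
# with 2 time units elapsing before the next decision epoch.
Q = defaultdict(float)
rho_state = {"rho": 0.0, "total_reward": 0.0, "total_time": 0.0}
new_q = smart_update(Q, rho_state, "s0", "accept", 100.0, 2.0,
                     "s1", ["accept", "reject"], alpha=0.5, greedy=True)
```

In a simulation-based setting, this update would be applied at every decision epoch generated by the simulator, so no transition probabilities need to be computed explicitly.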