Abstract
Suppose that one of two prices for the same product must be posted every day. Under each price, the demand function is described by a compound Poisson process with possibly unknown parameters. The objective is to post daily prices sequentially so as to maximize the total expected (possibly discounted) gross revenue over a finite pricing horizon. To balance learning the demand function against earning revenue, we formulate the optimal pricing problem as a bandit model and characterize the solution by means of stochastic dynamic programming. When only one demand function is unknown, the optimal pricing decision is determined by a pricing index, whose limit is the Gittins index. These index values also demonstrate that it may be worth sacrificing some immediate payoff in exchange for information that leads to better-informed decisions in the future. Moreover, the optimal stopping solution is derived and the myopic strategy is shown not to be optimal in general. When both demand functions are unknown, a version of the play-the-winner pricing rule is derived.
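The dynamic-programming formulation with one unknown demand function can be illustrated by a minimal sketch. It is not the article's model: it assumes plain Poisson daily demand (rather than compound Poisson) with a Gamma(shape a, rate b) prior on the unknown rate, which gives a negative binomial predictive distribution, and it truncates the demand sum at a hypothetical cutoff `k_max`. The names `predictive_pmf`, `make_value_function`, `r_known`, and `price` are illustrative only.

```python
import math
from functools import lru_cache


def predictive_pmf(k, a, b):
    """Negative binomial predictive P(X = k), assuming
    X | lam ~ Poisson(lam) and lam ~ Gamma(shape=a, rate=b)."""
    log_p = (math.lgamma(a + k) - math.lgamma(k + 1) - math.lgamma(a)
             + a * math.log(b / (b + 1.0)) - k * math.log(b + 1.0))
    return math.exp(log_p)


def make_value_function(r_known, price, beta=1.0, k_max=60):
    """Build value(n, a, b): the best expected revenue over n remaining days.

    r_known : expected daily revenue under the price with known demand
    price   : the price whose Poisson demand rate is unknown (assumption)
    beta    : discount factor (1.0 means no discounting)
    k_max   : truncation point for the demand sum (an approximation)
    """
    @lru_cache(maxsize=None)
    def value(n, a, b):
        if n == 0:
            return 0.0
        # Post the known-demand price: nothing is learned, so (a, b) is unchanged.
        known = r_known + beta * value(n - 1, a, b)
        # Post the unknown-demand price: expected revenue today is price * a / b,
        # and observing demand k updates the prior to Gamma(a + k, b + 1).
        unknown = price * a / b + beta * sum(
            predictive_pmf(k, a, b) * value(n - 1, a + k, b + 1.0)
            for k in range(k_max + 1)
        )
        return max(known, unknown)

    return value


value = make_value_function(r_known=9.0, price=5.0)
one_day = value(1, 2.0, 1.0)    # with one day left the decision is myopic
four_days = value(4, 2.0, 1.0)  # longer horizons can reward learning
```

Comparing the two branches inside `value` for a given state shows which price the sketch would post; varying `r_known` until the branches are equal gives a rough numerical analogue of an index value under these assumptions.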
ACKNOWLEDGMENTS
The author thanks the Editor, an Associate Editor, and an anonymous referee for their helpful suggestions, which have improved the presentation of the article, and is especially grateful to the anonymous referee for pointing out that the predictive distribution in Section 3 is a negative binomial distribution.
This research was partially supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada. Part of the work was completed during the author's visit to the Mathematics and Computational Science School, Xiangtan University, China, and was partially supported by the Natural Science Foundation of China (NSFC No. 30570426). The author thanks Xiangtan University for its warm hospitality.
Notes
Recommended by Shelly Zacks