Original Articles

A reinforcement learning approach to stochastic business games

Pages 373-385 | Received 01 Feb 2001, Accepted 01 Nov 2003, Published online: 17 Aug 2010

References

  • Abounadi , J. , Bertsekas , D. and Borkar , V. S. 1998 . Learning algorithms for Markov decision processes with average cost . Report LIDS-P-2434 , Cambridge, MA : Laboratory for Information and Decision Systems, MIT .
  • Anupindi , R. , Bassok , Y. and Zemel , E. 2001 . A general framework for the study of decentralized distribution systems . Journal of Manufacturing and Service Operations Management , : 4
  • Bellman , R. E. 1957 . Dynamic Programming , Princeton, NJ : Princeton University Press .
  • Bertsekas , D. and Tsitsiklis , J. 1996 . Neuro-Dynamic Programming , Belmont, MA : Athena Scientific .
  • Darken , C. , Chang , J. and Moody , J. 1992 . “ Learning rate schedules for faster stochastic gradient search ” . In Neural Networks for Signal Processing 2—Proceedings of the 1992 IEEE Workshop , Edited by: White , D. A. and Sofge , D. A. Piscataway, NJ : IEEE Press .
  • Das , T. K. , Gosavi , A. , Mahadevan , S. and Marchalleck , N. 1999 . Solving semi-Markov decision problems using average reward reinforcement learning . Management Science , 45 ( 4 ) : 560 – 574 .
  • Erev , I. and Roth , A. E. 1998 . Predicting how people play games: reinforcement learning in experimental games with unique, mixed strategy equilibria . The American Economic Review , 88 ( 4 ) : 848 – 881 .
  • Filar , J. and Vrieze , K. 1997 . Competitive Markov Decision Processes , New York, NY : Springer-Verlag .
  • Gosavi , A. 2004 . “ Reinforcement learning for long-run average cost ” . In European Journal of Operational Research , to appear .
  • Gosavi , A. , Bandla , N. and Das , T. K. 2002 . A reinforcement learning approach to airline seat allocation for multiple fare classes with overbooking . IIE Transactions , 34 ( 9 ) : 729 – 742 .
  • Hu , J. and Wellman , M. P. 1998 . “ Multi-agent reinforcement learning: theoretical framework and an algorithm ” . In Proceedings of the 15th International Conference on Machine Learning 242 – 250 .
  • Li , J. and Das , T. K. 2003 . Learning Nash equilibrium for average reward irreducible stochastic games , Tampa, FL : University of South Florida . Working paper, Department of Industrial and Management Systems Engineering
  • Littman , M. L. 1994 . “ Markov games as a framework for multi-agent reinforcement learning ” . In Proceedings of the 11th International Conference on Machine Learning 157 – 163 .
  • Nash , J. F. 1951 . Non-cooperative games . Annals of Mathematics , 54 : 286 – 295 .
  • Owen , G. 1975 . On the core of linear production games . Mathematical Programming , 9 : 358 – 370 .
  • Paternina , C. D. and Das , T. K. 2000 . Intelligent dynamic control policies for serial production lines . IIE Transactions , 33 ( 1 ) : 65 – 77 .
  • Puterman , M. L. 1994 . Markov Decision Processes , New York, NY : Wiley .
  • Ripley , B. D. 1996 . Pattern Recognition and Neural Networks , Cambridge, UK : Cambridge University Press .
  • Robbins , H. and Monro , S. 1951 . A stochastic approximation method . Annals of Mathematical Statistics , 22 : 400 – 407 .
  • Shapley , L. and Shubik , M. 1975 . Competitive outcomes in the core of market games . Technical report R-1692-NSF, The Rand Corporation .
  • Sutton , R. S. and Barto , A. 1998 . Reinforcement Learning , Cambridge, MA : MIT Press .
  • Van der Laan , G. , Talman , A. J. J. and Van der Heyden , L. 1987 . “ Simplicial variable dimension algorithms for solving the nonlinear complementarity problem on a product of unit simplices using a general labeling ” . In Mathematics of Operations Research 377 – 397 .
  • Van Roy , B. 1998 . Learning and value function approximation in complex decision processes , Cambridge, MA : Laboratory for Information and Decision Systems, MIT . Ph.D. thesis
  • Watkins , C. J. C. H. 1989 . Learning from delayed rewards , Cambridge, UK : Cambridge University . Ph.D. thesis
