Look-ahead control of conveyor-serviced production station by using potential-based online policy iteration

Pages 1917-1928 | Received 10 Aug 2008, Accepted 15 Feb 2009, Published online: 06 Aug 2009

References

  • Abounadi , J , Bertsekas , D and Borkar , V . 2001 . Learning Algorithms for Markov Decision Processes with Average Cost . SIAM Journal on Control and Optimization , 40 : 681 – 698 .
  • Beightler , CS and Crisp Jr , RM . 1968 . A Discrete-time Queueing Analysis of Conveyor-serviced Production Stations . Operations Research , 16 : 986 – 1001 .
  • Bertsekas , DP . 2001 . Dynamic Programming and Optimal Control , Belmont, MA : Athena Scientific .
  • Bradtke , SJ and Duff , MO . 1995 . “ Reinforcement Learning Methods for Continuous-time Markov Decision Problems ” . In Advances in Neural Information Processing Systems 7 , 393 – 400 . Cambridge, MA : MIT Press .
  • Cao , XR . 2007 . Stochastic Learning and Optimisation: A Sensitivity-based View , New York : Springer .
  • Das , TK , Gosavi , A , Mahadevan , S and Marchalleck , N . 1999 . Solving Semi-Markov Decision Problems using Average Reward Reinforcement Learning . Management Science , 45 : 560 – 574 .
  • Fang , HT and Cao , XR . 2004 . Potential-based Online Policy Iteration Algorithms for Markov Decision Processes . IEEE Transactions on Automatic Control , 49 : 493 – 505 .
  • Gosavi , A . 2003 . Simulation-Based Optimization: Parametric Optimization Techniques and Reinforcement Learning , Boston : Kluwer .
  • Gosavi , A . 2004a . A Reinforcement Learning Algorithm Based on Policy Iteration for Average Reward: Empirical Results with Yield Management and Convergence Analysis . Machine Learning , 55 : 5 – 29 .
  • Gosavi , A . 2004b . Reinforcement Learning for Long Run Average Cost . European Journal of Operational Research , 155 : 654 – 674 .
  • Lagoudakis , MG and Parr , R . 2003 . Least-squares Policy Iteration . Journal of Machine Learning Research , 4 : 1107 – 1149 .
  • Matsui , M . 1993 . A Generalised Model of Conveyor-serviced Production Station (CSPS) (in Japanese) . Journal of Japan Industrial Management Association , 44 : 25 – 32 .
  • Matsui , M . 2005 . CSPS Model: Look-ahead Controls and Physics . International Journal of Production Research , 43 : 2001 – 2025 .
  • Matsui , M and Shingu , T . 1978 . A Queueing Analysis of Conveyor-serviced Production Station and the Optimal Range Strategy . AIIE Transactions , 10 : 89 – 99 .
  • Muth , EJ and White , JA . 1979 . Conveyor Theory: A Survey . AIIE Transactions , 11 : 270 – 277 .
  • Nawijn , WM . 1981 . The Analysis of a Conveyor-serviced Production Station . European Journal of Operational Research , 6 : 67 – 74 .
  • Nawijn , WM . 1985 . The Optimal Look-ahead Policy for Admission to a Single Server System . Operations Research , 33 : 626 – 643 .
  • Puterman , ML . 1994 . Markov Decision Processes: Discrete Stochastic Dynamic Programming , New York : Wiley .
  • Schwartz , A . 1993 . “ A Reinforcement Learning Method for Maximising Undiscounted Rewards ” . In Proceedings of the Tenth International Conference on Machine Learning , 298 – 305 . Amherst, MA : Morgan Kaufmann .
  • Singh , S , Tadic , B and Doucet , A . 2007 . A Policy Gradient Method for Semi-Markov Decision Processes with Application to Call Admission Control . European Journal of Operational Research , 178 : 808 – 818 .
  • Tang , H , Xi , HS and Yin , BQ . 2005a . The Optimal Robust Control Policy for Uncertain Semi-Markov Control Processes . International Journal of Systems Science , 36 : 791 – 800 .
  • Tang , H , Yuan , JB , Lu , Y and Cheng , WJ . 2005b . Performance Potential-based Neuro-dynamic Programming for SMDPs . Acta Automatica Sinica , 31 : 642 – 645 .
  • Tang , H , Xi , HS and Yin , BQ . 2007 . Error Bounds of Optimisation Algorithms for Semi-Markov Decision Processes . International Journal of Systems Science , 38 : 725 – 736 .
  • Watkins , CJCH . 1989 . “ Learning from Delayed Rewards ” . PhD thesis , Cambridge, UK : Cambridge University .
  • Yin , BQ , Xi , HS and Zhou , YP . 2004 . Queueing System Performance Analysis and Markov Control Processes , Hefei : Press of University of Science and Technology of China .
