Original Articles

On iterative optimization of structured Markov decision processes with discounted rewards

Pages 439-459 | Received 01 May 1982, Published online: 04 Mar 2011

References

  • Bartmann, D. 1979. A method of bisection for discounted Markov decision problems. Zeitschrift für Oper. Res., 23: 275–287.
  • Bellman, R. 1957. A Markovian decision process. J. Math. Mech., 6: 679–684.
  • De Ghellinck, G.T. and Eppen, G.D. 1967. Linear programming solutions for separable Markov decision problems. Management Sci., 6: 371–394.
  • Blackwell, D. 1965. Discounted dynamic programming. Ann. Math. Statist., 36: 226–235.
  • Denardo, E.V. 1968. Separable Markovian decision problems. Management Sci., 14: 279–289.
  • D'Epenoux, F. 1960. Sur un problème de production et de stockage dans l'aléatoire. Rev. Franç. Rech. Opér., 14: 3–16.
  • Hastings, N.A.J. 1968. Some notes on dynamic programming and replacement. Oper. Res. Q., 19: 453–464.
  • Hastings, N.A.J. 1967. A test for nonoptimal actions in undiscounted finite Markov decision chains. Management Sci., 23: 87–91.
  • Hastings, N.A.J. and Van Nunen, J.A.E.E. 1977. “The action elimination algorithm for Markov decision processes”. In Markov decision theory, Edited by: Tijms, H. and Wessels, J. 161–170. Amsterdam: Mathematical Centre. MC-tract 93
  • Howard, R.A. 1960. Dynamic programming and Markov decision processes, Cambridge, Mass.: M.I.T. Press.
  • Hübner, G. Improved procedures for eliminating suboptimal actions in Markov programming by the use of contraction properties. Transactions of the 7th Prague Conference on Information Theory, Statistical Decision Functions and Random Processes. pp. 257–263. Prague: Academia.
  • Johansen, S.G. and Stidham, S. Jr. 1980. Control of arrivals to a stochastic input-output system. Adv. Appl. Prob., 12: 972–999.
  • MacQueen, J. 1966. A modified dynamic programming method for Markovian decision problems. J. Math. Anal. Appl., 14: 38–43.
  • MacQueen, J. 1967. A test for suboptimal actions in Markovian decision problems. Oper. Res., 15: 559–561.
  • Manne, A.S. 1960. Linear programming and sequential decisions. Management Sci., 6: 259–267.
  • Morton, T.E. and Wecker, W. 1977. Discounting, ergodicity and convergence for Markov decision processes. Management Sci., 23: 890–900.
  • Morton, T.E. 1971. Undiscounted Markov renewal programming via modified successive approximations. Oper. Res., 19: 1081–1089.
  • Van Nunen, J.A.E.E. 1976. Contracting Markov decision processes, Amsterdam: Mathematical Centre (Mathematical Centre Tract 71).
  • Van Nunen, J.A.E.E. 1976. A set of successive approximation methods for discounted Markovian decision problems. Zeitschrift für Oper. Res., 20: 203–208.
  • Van Nunen, J.A.E.E. 1981. Action dependent stopping times and Markov decision processes with unbounded rewards. O.R.-Spectrum, 8: 145–152.
  • Van Nunen, J.A.E.E. 1983. On computing optimal policies for G/M/S queuing systems. Management Sci., 29: 725–734.
  • Porteus, E.L. 1971. Some bounds for discounted sequential decision processes. Management Sci., 18: 7–11.
  • Porteus, E.L. 1975. Bounds and transformations for discounted finite Markov decision chains. Oper. Res., 23: 161–184.
  • Porteus, E.L. 1978. Improved iterative computation of the expected discounted return in Markov and semi-Markov chains, Stanford University. Research paper no. 344
  • Puterman, M.L. and Shin, M.C. 1978. Modified policy iteration algorithms for discounted Markov decision problems. Management Sci., 24: 1127–1137.
  • Reetz, D. 1973. Solution of a Markovian decision problem by overrelaxation. Z. Oper. Res., 17: 28–32.
  • Scarf, H. 1960. “The Optimality of (S, s) Policies in the Dynamic Inventory Problem”. In Mathematical Methods in the Social Sciences, Edited by: Arrow, K.J., Karlin, S. and Suppes, P. Stanford: Stanford University Press. Chap. 13
  • Sebastian, H.J. and Sieber, N. 1981. Diskrete Dynamische Optimierung, Leipzig: Akademische Verlagsgesellschaft Geest & Portig K.-G.
  • Stidham, S. Jr. 1981. On the convergence of successive approximations in dynamic programming with non-zero terminal reward. Z. für Opns. Res., 25: 57–77.
  • Stidham, S. and Wijngaard, J. 1983. Skip free Markov decision processes, Report, North Carolina State University. to appear
  • Wessels, J. 1977. Stopping times and Markov programming. Transactions of the 7th Prague Conference on Information Theory, Statistical Decision Functions, Random Processes. pp. 575–585. Prague: Academia.
  • Wessels, J. Markov decision processes: Implementation aspects. Memorandum COSOR 80-14, Eindhoven: Eindhoven University of Technology. Department of Mathematics
  • Geilleit, R.A.A.M. “A heuristic method for an inventory problem with two stages and two final products”. In Department of Industrial Engineering, Eindhoven University of Technology. Research Report (1981) (to appear)
  • Wijngaard, J. Aug 1972. An inventory problem with constrained order capacity, Eindhoven University of Technology. TH-report 72-wsk-03
