Research Articles

Using the proximal policy optimisation algorithm for solving the stochastic capacitated lot sizing problem

Pages 1955–1978 | Received 11 Mar 2021, Accepted 13 Mar 2022, Published online: 07 Apr 2022

References

  • Arrow, Kenneth J., Theodore Harris, and Jacob Marschak. 1951. “Optimal Inventory Policy.” Econometrica: Journal of the Econometric Society 19 (3): 250–272.
  • Axsäter, Sven. 1993. “Continuous Review Policies for Multi-level Inventory Systems with Stochastic Demand.” In Logistics of Production and Inventory. Handbooks of Operations Research and Management Science, Vol. 4, edited by Stephen C. Graves, Alexander H. G. Rinnooy Kan, and Paul H. Zipkin, 175–197. North-Holland.
  • Baker, Kenneth R. 1977. “An Experimental Study of the Effectiveness of Rolling Schedules in Production Planning.” Decision Sciences 8 (1): 19–27.
  • Bellman, Richard. 1954. “The Theory of Dynamic Programming.” Bulletin of the American Mathematical Society 60 (6): 503–515.
  • Bookbinder, James, and Jin-Yan Tan. 1988. “Strategies for the Probabilistic Lot-Sizing Problem with Service-Level Constraints.” Management Science 34 (9): 1096–1108.
  • Boute, Robert N., Joren Gijsbrechts, Willem van Jaarsveld, and Nathalie Vanvuchelen. 2021. “Deep Reinforcement Learning for Inventory Control: A Roadmap.” European Journal of Operational Research 298 (2): 401–412.
  • Brandimarte, Paolo. 2006. “Multi-item Capacitated Lot-Sizing with Demand Uncertainty.” International Journal of Production Research 44 (15): 2997–3022.
  • Buşoniu, Lucian, Tim de Bruin, Domagoj Tolić, Jens Kober, and Ivana Palunko. 2018. “Reinforcement Learning for Control: Performance, Stability, and Deep Approximators.” Annual Reviews in Control 46: 8–28.
  • Chaharsooghi, S. Kamal, Jafar Heydari, and S. Hessameddin Zegordi. 2008. “A Reinforcement Learning Model for Supply Chain Ordering Management: An Application to the Beer Game.” Decision Support Systems 45 (4): 949–959.
  • DeCroix, Gregory A., and Antonio Arreola-Risa. 1998. “Optimal Production and Inventory Policy for Multiple Products under Resource Constraints.” Management Science 44 (7): 950–961.
  • De Smet, Niels, Stefan Minner, El Houssaine Aghezzaf, and Bram Desmet. 2020. “A Linearisation Approach to the Stochastic Dynamic Capacitated Lotsizing Problem with Sequence-Dependent Changeovers.” International Journal of Production Research 58 (16): 4980–5005.
  • Donselaar, Karel H. Van, Vishal Gaur, Tom Van Woensel, Rob A. C. M. Broekmeulen, and Jan C. Fransoo. 2010. “Ordering Behavior in Retail Stores and Implications for Automated Replenishment.” Management Science 56 (5): 766–784.
  • Erlenkotter, Donald. 1989. “Note – An Early Classic Misplaced: Ford W. Harris's Economic Order Quantity Model of 1915.” Management Science 35 (7): 898–900.
  • Federgruen, A., and P. Zipkin. 1984. “An Efficient Algorithm for Computing Optimal (s, S) Policies.” Operations Research 32 (6): 1268–1285.
  • Giannoccaro, Ilaria, and Pierpaolo Pontrandolfo. 2002. “Inventory Management in Supply Chains: A Reinforcement Learning Approach.” International Journal of Production Economics 78 (2): 153–161.
  • Gijsbrechts, Joren, Robert N. Boute, Jan Albert Van Mieghem, and Dennis Zhang. 2021. “Can Deep Reinforcement Learning Improve Inventory Management? Performance on Dual Sourcing, Lost Sales and Multi-echelon Problems.” Manufacturing & Service Operations Management.
  • Glorot, Xavier, and Yoshua Bengio. 2010. “Understanding the Difficulty of Training Deep Feedforward Neural Networks.” Journal of Machine Learning Research 9: 249–256.
  • Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. Cambridge, MA: MIT Press.
  • Gosavi, Abhijit, Emrah Ozkaya, and Aykut F. Kahraman. 2007. “Simulation Optimization for Revenue Management of Airlines with Cancellations and Overbooking.” OR Spectrum 29 (1): 21–38.
  • Graves, Stephen C. 1988. “Safety Stocks in Manufacturing Systems.” Journal of Manufacturing and Operations Management 1: 67–101.
  • Graves, Stephen C., and Sean P. Willems. 2000. “Optimizing Strategic Safety Stock Placement in Supply Chains.” Manufacturing and Service Operations Management 2 (1): 68–83.
  • Helber, Stefan, Florian Sahling, and Katja Schimmelpfeng. 2013. “Dynamic Capacitated Lot Sizing with Random Demand and Dynamic Safety Stocks.” OR Spectrum 35 (1): 75–105.
  • Heuillet, Alexandre, Fabien Couthouis, and Natalia Díaz-Rodríguez. 2021. “Explainability in Deep Reinforcement Learning.” Knowledge-Based Systems 214: 106685.
  • Jans, Raf, and Zeger Degraeve. 2008. “Modeling Industrial Lot Sizing Problems: A Review.” International Journal of Production Research 46 (6): 1619–1643.
  • Jiang, Chengzhi, and Zhaohan Sheng. 2009. “Case-Based Reinforcement Learning for Dynamic Inventory Control in a Multi-agent Supply-Chain System.” Expert Systems with Applications 36 (3): 6520–6526.
  • Kara, Ahmet, and Ibrahim Dogan. 2018. “Reinforcement Learning Approaches for Specifying Ordering Policies of Perishable Inventory Systems.” Expert Systems with Applications 91: 150–158.
  • Karimi, Behrooz, S. M. T. Fatemi Ghomi, and J. M. Wilson. 2003. “The Capacitated Lot Sizing Problem: A Review of Models and Algorithms.” Omega 31 (5): 365–378.
  • Kim, C. O., J. Jun, J. K. Baek, R. L. Smith, and Y. D. Kim. 2005. “Adaptive Inventory Control Models for Supply Chain Management.” International Journal of Advanced Manufacturing Technology 26 (9–10): 1184–1192.
  • Kingma, Diederik P., and Jimmy Ba. 2014. “Adam: A Method for Stochastic Optimization.” Preprint arXiv:1412.6980.
  • Kool, Wouter, Herke van Hoof, and Max Welling. 2019. “Attention, Learn to Solve Routing Problems!” 1–25. Preprint arXiv:1803.08475v3.
  • Kwon, Ick-Hyun, Chang Ouk Kim, Jin Jun, and Jung Hoon Lee. 2008. “Case-Based Myopic Reinforcement Learning for Satisfying Target Service Level in Supply Chain.” Expert Systems with Applications 35 (1–2): 389–397.
  • Lee, Hau L. 2004. “The Triple-A Supply Chain.” Harvard Business Review 82 (10): 102–113.
  • Mnih, Volodymyr, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. 2016. “Asynchronous Methods for Deep Reinforcement Learning.” Proceedings of the 33rd International Conference on Machine Learning. New York, NY, USA.
  • Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. “Playing Atari with Deep Reinforcement Learning,” 1–9. Preprint arXiv:1312.5602.
  • Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, et al. 2015. “Human-Level Control Through Deep Reinforcement Learning.” Nature 518 (7540): 529–533.
  • Mula, Josefa, Raul Poler, Jose P. Garcia-Sabater, and Francisco Cruz Lario. 2006. “Models for Production Planning under Uncertainty: A Review.” International Journal of Production Economics 103 (1): 271–285.
  • Nahmias, Steven, and Tara Lennon Olsen. 2015a. “Inventory Control Subject to Known Demand.” In Production and Operations Analysis, 7th ed., 198–248. Long Grove, IL: Waveland Press.
  • Nahmias, Steven, and Tara Lennon Olsen. 2015b. “Inventory Control Subject to Uncertain Demand.” In Production and Operations Analysis, 7th ed., 249–314. Long Grove, IL: Waveland Press.
  • Oroojlooyjadid, Afshin, Mohammad Reza Nazari, Lawrence V. Snyder, and Martin Takáč. 2017. “A Deep Q-Network for the Beer Game: Reinforcement Learning for Inventory Optimization.” Preprint arXiv:1708.05924v3.
  • Park, Junyoung, Jaehyeong Chun, Sang Hun Kim, Youngkook Kim, and Jinkyoo Park. 2021. “Learning to Schedule Job-Shop Problems: Representation and Policy Learning Using Graph Neural Network and Reinforcement Learning.” International Journal of Production Research 59 (11): 3360–3377.
  • Paternina-Arboleda, Carlos D., and Tapas K. Das. 2005. “A Multi-Agent Reinforcement Learning Approach to Obtaining Dynamic Control Policies for Stochastic Lot Scheduling Problem.” Simulation Modelling Practice and Theory 13 (5): 389–406.
  • Pontrandolfo, P., A. Gosavi, O. G. Okogbaa, and T. K. Das. 2002. “Global Supply Chain Management: A Reinforcement Learning Approach.” International Journal of Production Research 40 (6): 1299–1317.
  • Powell, Warren B. 2007. Approximate Dynamic Programming: Solving the Curses of Dimensionality. Vol. 703. Hoboken, NJ: John Wiley & Sons.
  • Rummukainen, Hannu, and Jukka K. Nurminen. 2019. “Practical Reinforcement Learning Experiences in Scheduling Application.” IFAC PapersOnLine 52 (13): 1415–1420.
  • Schulman, John, Sergey Levine, Philipp Moritz, Michael Jordan, and Pieter Abbeel. 2015. “Trust Region Policy Optimization.” Proceedings of the 32nd International Conference on Machine Learning. Lille, France.
  • Schulman, John, Philipp Moritz, Sergey Levine, Michael I. Jordan, and Pieter Abbeel. 2016. “High-Dimensional Continuous Control Using Generalized Advantage Estimation.” Conference Track Proceedings of the 4th International Conference on Learning Representations. San Juan, Puerto Rico.
  • Schulman, John, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. “Proximal Policy Optimization Algorithms.” Preprint arXiv:1707.06347, 1–12.
  • Shannon, Claude Elwood. 1948. “A Mathematical Theory of Communication.” The Bell System Technical Journal 27 (3): 379–423.
  • Sutton, Richard S., and Andrew G. Barto. 2018. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
  • Tang, Jie, and Pieter Abbeel. 2010. “On a Connection Between Importance Sampling and the Likelihood Ratio Policy Gradient.” Advances in Neural Information Processing Systems 23: 1000–1008.
  • Tavaghof-Gigloo, Dariush, and Stefan Minner. 2020. “Planning Approaches for Stochastic Capacitated Lot-Sizing with Service Level Constraints.” International Journal of Production Research 59 (17): 5087–5107.
  • Tempelmeier, Horst. 2006. Inventory Management in Supply Networks: Problems, Models, Solutions. Köln, Germany: University of Cologne.
  • Tempelmeier, Horst. 2011. “A Column Generation Heuristic for Dynamic Capacitated Lot Sizing with Random Demand Under a Fill Rate Constraint.” Omega 39 (6): 627–633.
  • Tempelmeier, Horst. 2013. “Stochastic Lot Sizing Problems.” In Handbook of Stochastic Models and Analysis of Manufacturing System Operations. International Series in Operations Research & Management Science, 313–344. New York: Springer.
  • Tempelmeier, Horst, and Timo Hilger. 2015. “Linear Programming Models for a Stochastic Dynamic Capacitated Lot Sizing Problem.” Computers and Operations Research 59: 119–125.
  • Vanvuchelen, Nathalie, Joren Gijsbrechts, and Robert Boute. 2020. “Use of Proximal Policy Optimization for the Joint Replenishment Problem.” Computers in Industry 119: 103239.
  • Wang, Jiao, Xueping Li, and Xiaoyan Zhu. 2012. “Intelligent Dynamic Control of Stochastic Economic Lot Scheduling by Agent-based Reinforcement Learning.” International Journal of Production Research 50 (16): 4381–4395.
  • Winands, Erik M. M., Ivo J. B. F. Adan, and Geert-Jan van Houtum. 2011. “The Stochastic Economic Lot Scheduling Problem: A Survey.” European Journal of Operational Research 210 (1): 1–9.
  • Zahavy, Tom, Matan Haroush, Nadav Merlis, Daniel J. Mankowitz, and Shie Mannor. 2018. “Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning.” Preprint arXiv:1809.02121, 1–12.