Research Articles

Using the proximal policy optimisation algorithm for solving the stochastic capacitated lot sizing problem

Pages 1955–1978 | Received 11 Mar 2021, Accepted 13 Mar 2022, Published online: 07 Apr 2022

References

  • Arrow, Kenneth J., Theodore Harris, and Jacob Marschak. 1951. “Optimal Inventory Policy.” Econometrica: Journal of the Econometric Society 19 (3): 250–272.
  • Axsäter, Sven. 1993. “Continuous Review Policies for Multi-level Inventory Systems with Stochastic Demand.” In Logistics of Production and Inventory. Handbooks of Operations Research and Management Science, Vol. 4, edited by Stephen C. Graves, Alexander H. G. Rinnooy Kan, and Paul H. Zipkin, 175–197. North-Holland.
  • Baker, Kenneth R. 1977. “An Experimental Study of the Effectiveness of Rolling Schedules in Production Planning.” Decision Sciences 8 (1): 19–27.
  • Bellman, Richard. 1954. “The Theory of Dynamic Programming.” Bulletin of the American Mathematical Society 60 (6): 503–515.
  • Bookbinder, James, and Jin-Yan Tan. 1988. “Strategies for the Probabilistic Lot-Sizing Problem with Service-Level Constraints.” Management Science 34 (9): 1096–1108.
  • Boute, Robert N., Joren Gijsbrechts, Willem van Jaarsveld, and Nathalie Vanvuchelen. 2021. “Deep Reinforcement Learning for Inventory Control: A Roadmap.” European Journal of Operational Research 298 (2): 401–412.
  • Brandimarte, Paolo. 2006. “Multi-item Capacitated Lot-Sizing with Demand Uncertainty.” International Journal of Production Research 44 (15): 2997–3022.
  • Buşoniu, Lucian, Tim de Bruin, Domagoj Tolić, Jens Kober, and Ivana Palunko. 2018. “Reinforcement Learning for Control: Performance, Stability, and Deep Approximators.” Annual Reviews in Control 46: 8–28.
  • Chaharsooghi, S. Kamal, Jafar Heydari, and S. Hessameddin Zegordi. 2008. “A Reinforcement Learning Model for Supply Chain Ordering Management: An Application to the Beer Game.” Decision Support Systems 45 (4): 949–959.
  • DeCroix, Gregory A., and Antonio Arreola-Risa. 1998. “Optimal Production and Inventory Policy for Multiple Products under Resource Constraints.” Management Science 44 (7): 950–961.
  • De Smet, Niels, Stefan Minner, El Houssaine Aghezzaf, and Bram Desmet. 2020. “A Linearisation Approach to the Stochastic Dynamic Capacitated Lotsizing Problem with Sequence-Dependent Changeovers.” International Journal of Production Research 58 (16): 4980–5005.
  • Donselaar, Karel H. Van, Vishal Gaur, Tom Van Woensel, Rob A. C. M. Broekmeulen, and Jan C. Fransoo. 2010. “Ordering Behavior in Retail Stores and Implications for Automated Replenishment.” Management Science 56 (5): 766–784.
  • Erlenkotter, Donald. 1989. “Note – An Early Classic Misplaced: Ford W. Harris's Economic Order Quantity Model of 1915.” Management Science 35 (7): 898–900.
  • Federgruen, A., and P. Zipkin. 1984. “An Efficient Algorithm for Computing Optimal (s, S) Policies.” Operations Research 32 (6): 1268–1285.
  • Giannoccaro, Ilaria, and Pierpaolo Pontrandolfo. 2002. “Inventory Management in Supply Chains: A Reinforcement Learning Approach.” International Journal of Production Economics 78 (2): 153–161.
  • Gijsbrechts, Joren, Robert N. Boute, Jan Albert Van Mieghem, and Dennis Zhang. 2021. “Can Deep Reinforcement Learning Improve Inventory Management? Performance on Dual Sourcing, Lost Sales and Multi-echelon Problems.” Manufacturing & Service Operations Management.
  • Glorot, Xavier, and Yoshua Bengio. 2010. “Understanding the Difficulty of Training Deep Feedforward Neural Networks.” Journal of Machine Learning Research 9: 249–256.
  • Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. Cambridge, MA: MIT Press.
  • Gosavi, Abhijit, Emrah Ozkaya, and Aykut F. Kahraman. 2007. “Simulation Optimization for Revenue Management of Airlines with Cancellations and Overbooking.” OR Spectrum 29 (1): 21–38.
  • Graves, Stephen C. 1988. “Safety Stocks in Manufacturing Systems.” Journal of Manufacturing and Operations Management 1: 67–101.
  • Graves, Stephen C., and Sean P. Willems. 2000. “Optimizing Strategic Safety Stock Placement in Supply Chains.” Manufacturing and Service Operations Management 2 (1): 68–83.
  • Helber, Stefan, Florian Sahling, and Katja Schimmelpfeng. 2013. “Dynamic Capacitated Lot Sizing with Random Demand and Dynamic Safety Stocks.” OR Spectrum 35 (1): 75–105.
  • Heuillet, Alexandre, Fabien Couthouis, and Natalia Díaz-Rodríguez. 2021. “Explainability in Deep Reinforcement Learning.” Knowledge-Based Systems 214: 106685.
  • Jans, Raf, and Zeger Degraeve. 2008. “Modeling Industrial Lot Sizing Problems: A Review.” International Journal of Production Research 46 (6): 1619–1643.
  • Jiang, Chengzhi, and Zhaohan Sheng. 2009. “Case-Based Reinforcement Learning for Dynamic Inventory Control in a Multi-agent Supply-Chain System.” Expert Systems with Applications 36 (3): 6520–6526.
  • Kara, Ahmet, and Ibrahim Dogan. 2018. “Reinforcement Learning Approaches for Specifying Ordering Policies of Perishable Inventory Systems.” Expert Systems with Applications 91: 150–158.
  • Karimi, Behrooz, S. M. T. Fatemi Ghomi, and J. M. Wilson. 2003. “The Capacitated Lot Sizing Problem: A Review of Models and Algorithms.” Omega 31 (5): 365–378.
  • Kim, C. O., J. Jun, J. K. Baek, R. L. Smith, and Y. D. Kim. 2005. “Adaptive Inventory Control Models for Supply Chain Management.” International Journal of Advanced Manufacturing Technology 26 (9–10): 1184–1192.
  • Kingma, Diederik P., and Jimmy Ba. 2014. “Adam: A Method for Stochastic Optimization.” Preprint arXiv:1412.6980.
  • Kool, Wouter, Herke van Hoof, and Max Welling. 2019. “Attention, Learn to Solve Routing Problems!” 1–25. Preprint arXiv:1803.08475v3.
  • Kwon, Ick-Hyun, Chang Ouk Kim, Jin Jun, and Jung Hoon Lee. 2008. “Case-Based Myopic Reinforcement Learning for Satisfying Target Service Level in Supply Chain.” Expert Systems with Applications 35 (1–2): 389–397.
  • Lee, Hau L. 2004. “The Triple-A Supply Chain.” Harvard Business Review 82 (10): 102–113.
  • Mnih, Volodymyr, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. 2016. “Asynchronous Methods for Deep Reinforcement Learning.” Proceedings of the 33rd International Conference on Machine Learning. New York, NY, USA.
  • Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. “Playing Atari with Deep Reinforcement Learning,” 1–9. Preprint arXiv:1312.5602.
  • Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, et al. 2015. “Human-Level Control Through Deep Reinforcement Learning.” Nature 518 (7540): 529–533.
  • Mula, Josefa, Raul Poler, Jose P. Garcia-Sabater, and Francisco Cruz Lario. 2006. “Models for Production Planning under Uncertainty: A Review.” International Journal of Production Economics 103 (1): 271–285.
  • Nahmias, Steven, and Tara Lennon Olsen. 2015a. “Inventory Control Subject to Known Demand.” In Production and Operations Analysis, 7th ed., 198–248. Long Grove, IL: Waveland Press.
  • Nahmias, Steven, and Tara Lennon Olsen. 2015b. “Inventory Control Subject to Uncertain Demand.” In Production and Operations Analysis, 7th ed., 249–314. Long Grove, IL: Waveland Press.
  • Oroojlooyjadid, Afshin, Mohammad Reza Nazari, Lawrence V. Snyder, and Martin Takáč. 2017. “A Deep Q-Network for the Beer Game: Reinforcement Learning for Inventory Optimization.” Preprint arXiv:1708.05924v3.
  • Park, Junyoung, Jaehyeong Chun, Sang Hun Kim, Youngkook Kim, and Jinkyoo Park. 2021. “Learning to Schedule Job-Shop Problems: Representation and Policy Learning Using Graph Neural Network and Reinforcement Learning.” International Journal of Production Research 59 (11): 3360–3377.
  • Paternina-Arboleda, Carlos D., and Tapas K. Das. 2005. “A Multi-Agent Reinforcement Learning Approach to Obtaining Dynamic Control Policies for Stochastic Lot Scheduling Problem.” Simulation Modelling Practice and Theory 13 (5): 389–406.
  • Pontrandolfo, P., A. Gosavi, O. G. Okogbaa, and T. K. Das. 2002. “Global Supply Chain Management: A Reinforcement Learning Approach.” International Journal of Production Research 40 (6): 1299–1317.
  • Powell, Warren B. 2007. Approximate Dynamic Programming: Solving the Curses of Dimensionality. Vol. 703. Hoboken, NJ: John Wiley & Sons.
  • Rummukainen, Hannu, and Jukka K. Nurminen. 2019. “Practical Reinforcement Learning Experiences in Scheduling Application.” IFAC PapersOnLine 52 (13): 1415–1420.
  • Schulman, John, Sergey Levine, Philipp Moritz, Michael Jordan, and Pieter Abbeel. 2015. “Trust Region Policy Optimization.” Proceedings of the 32nd International Conference on Machine Learning. Lille, France.
  • Schulman, John, Philipp Moritz, Sergey Levine, Michael I. Jordan, and Pieter Abbeel. 2016. “High-Dimensional Continuous Control Using Generalized Advantage Estimation.” Conference Track Proceedings of the 4th International Conference on Learning Representations. San Juan, Puerto Rico.
  • Schulman, John, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. “Proximal Policy Optimization Algorithms.” Preprint arXiv:1707.06347, 1–12.
  • Shannon, Claude Elwood. 1948. “A Mathematical Theory of Communication.” The Bell System Technical Journal 27 (3): 379–423.
  • Sutton, Richard S., and Andrew G. Barto. 2018. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
  • Tang, Jie, and Pieter Abbeel. 2010. “On a Connection Between Importance Sampling and the Likelihood Ratio Policy Gradient.” Advances in Neural Information Processing Systems 23: 1000–1008.
  • Tavaghof-Gigloo, Dariush, and Stefan Minner. 2020. “Planning Approaches for Stochastic Capacitated Lot-Sizing with Service Level Constraints.” International Journal of Production Research 59 (17): 5087–5107.
  • Tempelmeier, Horst. 2006. Inventory Management in Supply Networks: Problems, Models, Solutions. Köln, Germany: University of Cologne.
  • Tempelmeier, Horst. 2011. “A Column Generation Heuristic for Dynamic Capacitated Lot Sizing with Random Demand Under a Fill Rate Constraint.” Omega 39 (6): 627–633.
  • Tempelmeier, Horst. 2013. “Stochastic Lot Sizing Problems.” In Handbook of Stochastic Models and Analysis of Manufacturing System Operations. International Series in Operations Research & Management Science, 313–344. New York: Springer.
  • Tempelmeier, Horst, and Timo Hilger. 2015. “Linear Programming Models for a Stochastic Dynamic Capacitated Lot Sizing Problem.” Computers and Operations Research 59: 119–125.
  • Vanvuchelen, Nathalie, Joren Gijsbrechts, and Robert Boute. 2020. “Use of Proximal Policy Optimization for the Joint Replenishment Problem.” Computers in Industry 119: 103239.
  • Wang, Jiao, Xueping Li, and Xiaoyan Zhu. 2012. “Intelligent Dynamic Control of Stochastic Economic Lot Scheduling by Agent-based Reinforcement Learning.” International Journal of Production Research 50 (16): 4381–4395.
  • Winands, Erik M. M., Ivo J. B. F. Adan, and Geert-Jan van Houtum. 2011. “The Stochastic Economic Lot Scheduling Problem: A Survey.” European Journal of Operational Research 210 (1): 1–9.
  • Zahavy, Tom, Matan Haroush, Nadav Merlis, Daniel J. Mankowitz, and Shie Mannor. 2018. “Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning.” Preprint arXiv:1809.02121, 1–12.