Performance of deep reinforcement learning algorithms in two-echelon inventory control systems

Pages 6211-6226 | Received 11 May 2023, Accepted 19 Jan 2024, Published online: 01 Mar 2024

