Research Article

Explainable reinforcement learning in production control of job shop manufacturing system

Pages 5812-5834 | Received 17 Feb 2021, Accepted 17 Aug 2021, Published online: 13 Sep 2021

References

  • Arel, I., C. Liu, T. Urbanik, and A. G. Kohls. 2010. “Reinforcement Learning-Based Multi-Agent System for Network Traffic Signal Control.” IET Intelligent Transport Systems 4 (2): 128.
  • Aydin, M. Emin, and Ercan Öztemel. 2000. “Dynamic Job-Shop Scheduling using Reinforcement Learning Agents.” Robotics and Autonomous Systems 33 (2–3): 169–178.
  • Bach, Sebastian, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek. 2015. “On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation.” PloS One 10 (7): e0130140.
  • Baker, Bowen, Ingmar Kanitscheider, Todor Markov, Yi Wu, Glenn Powell, Bob McGrew, and Igor Mordatch. 2019. “Emergent Tool Use from Multi-Agent Autocurricula.” http://arxiv.org/pdf/1909.07528v2.
  • Blazewicz, Jacek, Jan Karel Lenstra, and A. H. G. Rinnooy Kan. 1983. “Scheduling Subject to Resource Constraints: Classification and Complexity.” Discrete Applied Mathematics 5 (1): 11–24.
  • Brittain, Marc, and Peng Wei. 2019. “Autonomous Air Traffic Controller: A Deep Multi-Agent Reinforcement Learning Approach.” Preprint. arXiv:1905.01303.
  • Ciosek, Kamil, Quan Vuong, Robert Loftin, and Katja Hofmann. 2019. “Better Exploration with Optimistic Actor-Critic.” NeurIPS 2019. http://arxiv.org/pdf/1910.12807v1.
  • Colledani, Marcello, Tullio Tolio, Anath Fischer, Benoit Iung, Gisela Lanza, Robert Schmitt, and József Váncza. 2014. “Design and Management of Manufacturing Systems for Production Quality.” CIRP Annals 63 (2): 773–796.
  • Dewey, Daniel. 2014. “Reinforcement Learning and the Reward Engineering Principle.” In 2014 AAAI Spring Symposium Series, Palo Alto, 24.03. - 26.03.2014, edited by Carol M. Hamilton, 13–16. Palo Alto: AAAI Press. https://www.aaai.org/ocs/index.php/SSS/SSS14/paper/viewPaper/7704.
  • Dulac-Arnold, Gabriel, Daniel Mankowitz, and Todd Hester. 2019. “Challenges of Real-World Reinforcement Learning.” Preprint. arXiv:1904.12901.
  • Ennen, Philipp, Sebastian Reuter, Rene Vossen, and Sabina Jeschke. 2016. “Automated Production Ramp-Up Through Self-Learning Systems.” Procedia CIRP 51: 57–62.
  • Gabel, Thomas, and Martin Riedmiller. 2008. “Adaptive Reactive Job-Shop Scheduling with Reinforcement Learning Agents.” International Journal of Information Technology and Intelligent Computing 24 (4): 14–18.
  • Gabel, Thomas, and Martin Riedmiller. 2012. “Distributed Policy Search Reinforcement Learning for Job-Shop Scheduling Tasks.” International Journal of Production Research 50 (1): 41–61. http://www.tandfonline.com/doi/abs/10.1080/00207543.2011.571443.
  • Ganin, Yaroslav, Tejas Kulkarni, Igor Babuschkin, S. M. Ali Eslami, and Oriol Vinyals. 2018. “Synthesizing Programs for Images using Reinforced Adversarial Learning.” http://arxiv.org/pdf/1804.01118v1.
  • Gläscher, Jan, Nathaniel Daw, Peter Dayan, and John P O'Doherty. 2010. “States Versus Rewards: Dissociable Neural Prediction Error Signals Underlying Model-Based and Model-Free Reinforcement Learning.” Neuron 66 (4): 585–595.
  • Graham, R. L., E. L. Lawler, J. K. Lenstra, and A. H. G. Rinnooy Kan. 1979. “Optimization and Approximation in Deterministic Sequencing and Scheduling: A Survey.” Annals of Discrete Mathematics 5: 287–326.
  • Graves, Stephen C. 1981. “A Review of Production Scheduling.” Operations Research 29 (4): 646–675. https://doi.org/10.1287/opre.29.4.646.
  • Greschke, P., M. Schönemann, S. Thiede, and C. Herrmann. 2014. “Matrix Structures for High Volumes and Flexibility in Production Systems.” Procedia CIRP 17: 160–165.
  • Greydanus, Sam, Anurag Koul, Jonathan Dodge, and Alan Fern. 2017. “Visualizing and Understanding Atari Agents.” http://arxiv.org/pdf/1711.00138v5.
  • Hausknecht, Matthew, and Peter Stone. 2015. “Deep Recurrent Q-Learning for Partially Observable MDPs.” http://arxiv.org/pdf/1507.06527v4.
  • He, Sen, and Nicolas Pugeault. 2018. “Deep Saliency: What is Learnt by a Deep Network about Saliency?” Preprint. arXiv:1801.04261.
  • Hsu, Chi-Hung, Shu-Huan Chang, Jhao-Hong Liang, Hsin-Ping Chou, Chun-Hao Liu, Shih-Chieh Chang, Jia-Yu Pan, Yu-Ting Chen, Wei Wei, and Da-Cheng Juan. 2018. “Monas: Multi-Objective Neural Architecture Search using Reinforcement Learning.” Preprint. arXiv:1806.10332.
  • Jaques, Natasha, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, D. J. Strouse, Joel Z. Leibo, and Nando de Freitas. 2018. “Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning.” http://arxiv.org/pdf/1810.08647v4.
  • Jaunet, Theo, Romain Vuillemot, and Christian Wolf. 2019. “DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning.” http://arxiv.org/pdf/1909.02982v1.
  • Kober, Jens, and Jan Peters, eds. 2014. Learning Motor Skills. Springer Tracts in Advanced Robotics. Cham: Springer International Publishing.
  • Kuhnle, Andreas. 2020. “Adaptive Order Dispatching based on Reinforcement Learning: Application in a Complex Job Shop in the Semiconductor Industry.” PhD diss., Karlsruhe Institute of Technology (KIT), Karlsruhe.
  • Kuhnle, Andreas, Jan-Philipp Kaiser, Felix Theiß, Nicole Stricker, and Gisela Lanza. 2021. “Designing an Adaptive Production Control System Using Reinforcement Learning.” Journal of Intelligent Manufacturing 32: 855–876.
  • Kuhnle, Alexander, Michael Schaarschmidt, and Kai Fricke. 2017. “Tensorforce: A TensorFlow Library for Applied Reinforcement Learning.” https://github.com/tensorforce/tensorforce.
  • Kuhnle, Andreas, Louis Schäfer, Nicole Stricker, and Gisela Lanza. 2019. “Design, Implementation and Evaluation of Reinforcement Learning for an Adaptive Order Dispatching in Job Shop Manufacturing Systems.” Procedia CIRP 81: 234–239.
  • Küpper, Daniel, Christoph Sieben, Kristian Kuhlmann, Yew Lim, and Justin Ahmad. 2018. Will Flexible-Cell Manufacturing Revolutionize Carmaking? Cologne: Boston Consulting Group. https://www.bcg.com/de-de/publications/2018/flexible-cell-manufacturing-revolutionize-carmaking.aspx.
  • Lasi, Heiner, Peter Fettke, Hans-Georg Kemper, Thomas Feld, and Michael Hoffmann. 2014. “Industry 4.0.” Business & Information Systems Engineering 6 (4): 239–242.
  • Law, Averill M. 2014. Simulation Modeling and Analysis. 5th ed. New York: McGraw-Hill Education.
  • Levine, Sergey, Chelsea Finn, Trevor Darrell, and Pieter Abbeel. 2015. “End-to-End Training of Deep Visuomotor Policies.” http://arxiv.org/pdf/1504.00702v5.
  • Li, Sheng, Maxim Egorov, and Mykel Kochenderfer. 2019. “Optimizing Collision Avoidance in Dense Airspace using Deep Reinforcement Learning.” http://arxiv.org/pdf/1912.10146v1.
  • Li, Jiwei, Will Monroe, and Dan Jurafsky. 2017. “Understanding Neural Networks through Representation Erasure.” http://arxiv.org/pdf/1612.08220v3.
  • Little, John D. C. 1961. “A Proof for the Queuing Formula: L=λW.” Operations Research 9 (3): 383–387. http://pubsonline.informs.org/doi/abs/10.1287/opre.9.3.383.
  • Madumal, Prashan, Tim Miller, Liz Sonenberg, and Frank Vetere. 2019. “Explainable Reinforcement Learning through a Causal Lens.” http://arxiv.org/pdf/1905.10958v2.
  • Mahadevan, Sridhar, and Georgios Theocharous. 1998. “Optimizing Production Manufacturing using Reinforcement Learning.” In Proceedings of the 11th International Florida Artificial Intelligence Research Society Conference, Sanibel Island, 18.05. - 20.05.1998, edited by D. J. Cook, 372–377. Palo Alto: AAAI Press.
  • Mao, Hongzi, Mohammad Alizadeh, Ishai Menache, and Srikanth Kandula. 2016. “Resource Management with Deep Reinforcement Learning.” In Proceedings of the 15th ACM Workshop on Hot Topics in Networks - HotNets '16, edited by Bryan Ford, Alex C. Snoeren, and Ellen Zegura, 50–56. New York, NY: ACM Press.
  • Mavrin, Borislav, Shangtong Zhang, Hengshuai Yao, Linglong Kong, Kaiwen Wu, and Yaoliang Yu. 2019. “Distributional Reinforcement Learning for Efficient Exploration.” ICML. http://arxiv.org/pdf/1905.06125v1.
  • May, Marvin Carl, Lars Kiefer, Andreas Kuhnle, Nicole Stricker, and Gisela Lanza. 2021a. “Decentralized Multi-Agent Production Control through Economic Model Bidding for Matrix Production Systems.” Procedia CIRP 96: 3–8.
  • May, Marvin Carl, Leonard Overbeck, Marco Wurster, Andreas Kuhnle, and Gisela Lanza. 2021b. “Foresighted Digital Twin for Situational Agent Selection in Production Control.” Procedia CIRP 99: 27–32.
  • May, Marvin Carl, Simon Schmidt, Andreas Kuhnle, Nicole Stricker, and Gisela Lanza. 2021c. “Product Generation Module: Automated Production Planning for Optimized Workload and Increased Efficiency in Matrix Production Systems.” Procedia CIRP 96: 45–50.
  • Mayer, Sebastian, Dennis Gankin, Christian Arnet, and Christian Endisch. 2019. “Adaptive Production Control with Negotiating Agents in Modular Assembly Systems.” In 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), 120–127. IEEE.
  • Mendonca, Russell, Abhishek Gupta, Rosen Kralev, Pieter Abbeel, Sergey Levine, and Chelsea Finn. 2019. “Guided Meta-Policy Search.” http://arxiv.org/pdf/1904.00956v1.
  • Mes, Martijn, Matthieu Van Der Heijden, and Aart Van Harten. 2007. “Comparison of Agent-Based Scheduling to Look-Ahead Heuristics for Real-Time Transportation Problems.” European Journal of Operational Research 181 (1): 59–75.
  • Meurer, Aaron, Christopher P. Smith, Mateusz Paprocki, Ondřej Čertík, Sergey B. Kirpichev, Matthew Rocklin, and Amit Kumar, et al. 2017. “SymPy: Symbolic Computing in Python.” PeerJ Computer Science 3: e103. https://doi.org/10.7717/peerj-cs.103.
  • Minguillon, Fabio Echsler, and Gisela Lanza. 2019. “Coupling of Centralized and Decentralized Scheduling for Robust Production in Agile Production Systems.” Procedia CIRP 79: 385–390.
  • Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. “Playing Atari with Deep Reinforcement Learning.” Preprint. arXiv:1312.5602. http://arxiv.org/abs/1312.5602.
  • Mönch, L., J. W. Fowler, and S. J. Mason. 2013. Production Planning and Control for Wafer Fabrication Facilities: Modeling, Analysis, and Systems. Vol. 10. New York: Springer.
  • Morch, N. J. S., Ulrik Kjems, Lars Kai Hansen, C. Svarer, I. Law, B. Lautrup, S. Strother, and K. Rehm. 1995. “Visualization of Neural Networks using Saliency Maps.” In Proceedings of ICNN'95-International Conference on Neural Networks, vol. 4, 2085–2090. IEEE.
  • Ng, Andrew, D. Harada, and Stuart Russell. 1999. “Policy Invariance under Reward Transformations: Theory and Application to Reward Shaping.” In Proceedings of the Sixteenth International Conference on Machine Learning, Bled, 27.06. - 30.06.1999, edited by Ivan Bratko and S. Dzeroski, 278–287. San Francisco: Morgan Kaufmann Publishers Inc. http://luthuli.cs.uiuc.edu/daf/courses/games/AIpapers/ng99policy.pdf.
  • Ou, Xinyan, Qing Chang, and Nilanjan Chakraborty. 2019. “Simulation Study on Reward Function of Reinforcement Learning in Gantry Work Cell Scheduling.” Journal of Manufacturing Systems 50: 1–8. doi:10.1016/j.jmsy.2018.11.005. https://linkinghub.elsevier.com/retrieve/pii/S0278612518304503.
  • Rakelly, Kate, Aurick Zhou, Deirdre Quillen, Chelsea Finn, and Sergey Levine. 2019. “Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables.” http://arxiv.org/pdf/1903.08254v1.
  • Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD '16, edited by Balaji Krishnapuram, Mohak Shah, Alex Smola, Charu Aggarwal, Dou Shen, and Rajeev Rastogi, 1135–1144. New York, NY: ACM Press.
  • Riedmiller, Simone, and Martin Riedmiller. 1999. “A Neural Reinforcement Learning Approach to Learn Local Dispatching Policies in Production Scheduling.” In Proceedings of the 16th International Joint Conference on Artificial Intelligence, Stockholm, 31.07. - 06.08.1999, edited by Thomas Dean, 764–769. San Francisco: Morgan Kaufmann Publishers Inc.
  • Russell, Stuart J., and Peter Norvig. 2016. Artificial Intelligence: A Modern Approach. 3rd ed. Always Learning. Boston: Pearson.
  • Samek, Wojciech, Grégoire Montavon, Andrea Vedaldi, Lars Kai Hansen, and Klaus-Robert Müller, eds. 2019. Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. Lecture Notes in Computer Science. Cham: Springer International Publishing.
  • Schönemann, Malte, Christoph Herrmann, Peter Greschke, and Sebastian Thiede. 2015. “Simulation of Matrix-structured Manufacturing Systems.” Journal of Manufacturing Systems 37 (1): 104–112.
  • Schulman, John, Sergey Levine, Pieter Abbeel, Michael Jordan, and Philipp Moritz. 2015. “Trust Region Policy Optimization.” In International Conference on Machine Learning, 1889–1897.
  • Schulman, John, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. “Proximal Policy Optimization Algorithms.” Preprint. arXiv:1707.06347.
  • Selvaraju, Ramprasaath R., Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. 2017. “Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization.” In 2017 IEEE International Conference on Computer Vision (ICCV), 618–626. http://arxiv.org/pdf/1610.02391v4.
  • Silver, David, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, and Marc Lanctot, et al. 2018. “A General Reinforcement Learning Algorithm that Masters Chess, Shogi, and Go Through Self-play.” Science 362 (6419): 1140–1144.
  • Silver, David, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, and Thomas Hubert, et al. 2017. “Mastering the Game of Go Without Human Knowledge.” Nature 550 (7676): 354–359. doi:10.1038/nature24270. http://www.nature.com/articles/nature24270.
  • Simonyan, Karen, Andrea Vedaldi, and Andrew Zisserman. 2013. “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps.” Preprint. arXiv:1312.6034.
  • Stricker, Nicole, Andreas Kuhnle, Roland Sturm, and Simon Friess. 2018. “Reinforcement Learning for Adaptive Order Dispatching in the Semiconductor Industry.” CIRP Annals 67 (1): 511–514.
  • Sutton, Richard S., and Andrew G. Barto. 2018. Reinforcement Learning: An Introduction. 2nd ed. Cambridge, MA: MIT Press.
  • Tedrake, R., T. W. Zhang, and H. S. Seung. 2004. “Stochastic Policy Gradient Reinforcement Learning on a Simple 3D Biped.” In 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566), 2849–2854. IEEE.
  • Vinyals, Oriol, Igor Babuschkin, Wojciech M. Czarnecki, Michaël Mathieu, Andrew Dudzik, Junyoung Chung, David H. Choi, et al. 2019. “Grandmaster Level in StarCraft II Using Multi-agent Reinforcement Learning.” Nature 575 (7782): 350–354.
  • Wang, Jiao, Xueping Li, and Xiaoyan Zhu. 2012. “Intelligent Dynamic Control of Stochastic Economic Lot Scheduling by Agent-Based Reinforcement Learning.” International Journal of Production Research 50 (16): 4381–4395.
  • Waschneck, Bernd, Thomas Altenmüller, Thomas Bauernhansl, Alexander Knapp, Andreas Kyek, Andre Reichstaller, Lenz Belzner, et al. 2018. “Deep Reinforcement Learning for Semiconductor Production Scheduling.” In 2018 29th Annual SEMI Advanced Semiconductor Manufacturing Conference (ASMC 2018), Saratoga Springs, NY, 30.04. - 03.05.2018, edited by A. Roussy and R. Van Roijen, 301–306. Piscataway: IEEE. https://ieeexplore.ieee.org/document/8373191/.
  • Waschneck, Bernd, Thomas Altenmüller, Thomas Bauernhansl, and Andreas Kyek. 2016. “Production Scheduling in Complex Job Shops from an Industrie 4.0 Perspective: A Review and Challenges in the Semiconductor Industry.” Graz.
  • Wiendahl, H.-P., Hoda A. ElMaraghy, Peter Nyhuis, Michael F. Zäh, H.-H. Wiendahl, Neil Duffie, and Michael Brieke. 2007. “Changeable Manufacturing-Classification, Design and Operation.” CIRP Annals 56 (2): 783–809.
  • Wu, Jun, Xin Xu, Pengcheng Zhang, and Chunming Liu. 2011. “A Novel Multi-Agent Reinforcement Learning Approach for Job Scheduling in Grid Computing.” Future Generation Computer Systems 27 (5): 430–439.
  • Yao, Mariya. 2019. “Breakthrough Research In Reinforcement Learning From 2019.” Accessed May 5, 2020. https://www.topbots.com/top-ai-reinforcement-learning-research-papers-2019/.
  • Zhang, Wei, and Thomas G. Dietterich. 1995. “A Reinforcement Learning Approach to Job-Shop Scheduling.” In IJCAI, vol. 95, 1114–1120. Citeseer.
  • Zhang, Zhicong, Li Zheng, Forest Hou, and Na Li. 2011. “Semiconductor Final Test Scheduling with Sarsa(λ, K) Algorithm.” European Journal of Operational Research 215 (2): 446–458.
