Abstract
Contemporary organizations recognize the importance of lean and green production for realizing ecological and economic benefits. Compared with existing optimization methods, multi-task multi-objective reinforcement learning (MT-MORL) offers an attractive means of addressing the dynamic, multi-target process-optimization problems associated with Energy-Flexible Machining (EFM). Despite recent advances in reinforcement learning, accurately representing the Pareto frontier remains a major challenge. This article presents a generative manifold-based policy-search method to approximate the continuously distributed Pareto frontier for EFM optimization. To this end, multi-pass operations are formulated as a multi-policy Markov decision process in which the machining configurations change dynamically. Because the traditional Gaussian distribution cannot accurately fit complex upper-level policies, a multi-layered generator is designed to map a simple Gaussian distribution onto the high-dimensional policy manifold without complex calculations. Additionally, a hybrid multi-task training approach is proposed to handle the mode collapse and large task differences that arise when improving generalization performance. Extensive computational tests and comparisons against existing baseline methods demonstrate the improved Pareto frontier quality and computational efficiency of the proposed algorithm.
Additional information
Notes on contributors
Qinge Xiao
Qinge Xiao received her PhD degree in Mechanical Engineering from Chongqing University, Chongqing, China, in 2019. She is currently a Postdoctoral Researcher at Shenzhen University, specializing in machine learning techniques and intelligent optimization methods for sustainable and flexible production systems. She has authored or co-authored 20 technical papers in journals and conference proceedings.
Ben Niu
Ben Niu received his PhD degree from the Shenyang Institute of Automation of the Chinese Academy of Sciences, Shenyang, China, in 2008. He visited the University of Edinburgh in 2018, Arizona State University in 2016, and Victoria University of Wellington in 2013. He is presently a Guangdong Zhujiang Scholar Professor in the Department of Management Science, Shenzhen University. He has published more than 150 papers in international journals and conferences. His main fields of research are Swarm Intelligent Optimization, Operations Research, and their applications in Big Data Analysis, Manufacturing Intelligence, and Resource Optimization.
Ying Chen
Ying Chen received his PhD degree in Industrial Engineering from the University of Texas at Arlington. He is currently an Assistant Professor in the School of Economics and Management at Harbin Institute of Technology. His research interests include data mining, machine learning, decision-making under uncertainty, and optimization.