TY - JOUR
T1 - Adaptive optimal process control with actor-critic design for energy-efficient batch machining subject to time-varying tool wear
AU - Xiao, Qinge
AU - Yang, Zhile
AU - Zhang, Yingfeng
AU - Zheng, Pai
N1 - Funding Information:
This work was supported in part by the Natural Science Basic Research Program of Shaanxi (2023-JC-JQ-39) and the Natural Science Foundation of Guangdong Province (2020A1515110541).
Publisher Copyright:
© 2023 The Society of Manufacturing Engineers
PY - 2023/4
Y1 - 2023/4
N2 - Batch machining systems are essential for improving productivity and quality, but they consume considerable energy due to the continuous interaction among machine tools, workpieces, and cutting tools. In contrast to single-piece machining, which has a short production cycle, the impact of tool wear on energy consumption in batch machining systems cannot be underestimated. However, few studies have focused on adaptive process control subject to time-varying tool wear, because process optimization has previously been treated as a static problem. As an alternative to metaheuristic algorithms, reinforcement learning (RL) offers an attractive means of solving such a dynamic, high-dimensional, and highly coupled problem. For the case of turning cylindrical parts, an energy-efficient decision model is developed for the process control of pass operations in batch machining. The decision variables are decoupled by reformulating the problem as a Markov decision process in which the tool wear changes dynamically. To solve the problem, an actor-critic RL framework with a multi-constraint and multi-objective design is developed. Based on this framework, a dynamic process control method is proposed in which the RL agent observes workpiece features, machining requirements, and tool wear states (inputs) and adaptively selects control parameters such as cutting speed, feed rate, and cutting rate (outputs), with the aim of conserving energy. Two application tests and comparisons against metaheuristic methods are performed. The results indicate that the method can reduce energy consumption by more than 20% compared with energy-efficient optimization that ignores tool wear effects. The RL agent learns about three times faster than the metaheuristics, and its online sampling time is less than 0.1 millisecond, which facilitates real-time control of process parameters.
AB - Batch machining systems are essential for improving productivity and quality, but they consume considerable energy due to the continuous interaction among machine tools, workpieces, and cutting tools. In contrast to single-piece machining, which has a short production cycle, the impact of tool wear on energy consumption in batch machining systems cannot be underestimated. However, few studies have focused on adaptive process control subject to time-varying tool wear, because process optimization has previously been treated as a static problem. As an alternative to metaheuristic algorithms, reinforcement learning (RL) offers an attractive means of solving such a dynamic, high-dimensional, and highly coupled problem. For the case of turning cylindrical parts, an energy-efficient decision model is developed for the process control of pass operations in batch machining. The decision variables are decoupled by reformulating the problem as a Markov decision process in which the tool wear changes dynamically. To solve the problem, an actor-critic RL framework with a multi-constraint and multi-objective design is developed. Based on this framework, a dynamic process control method is proposed in which the RL agent observes workpiece features, machining requirements, and tool wear states (inputs) and adaptively selects control parameters such as cutting speed, feed rate, and cutting rate (outputs), with the aim of conserving energy. Two application tests and comparisons against metaheuristic methods are performed. The results indicate that the method can reduce energy consumption by more than 20% compared with energy-efficient optimization that ignores tool wear effects. The RL agent learns about three times faster than the metaheuristics, and its online sampling time is less than 0.1 millisecond, which facilitates real-time control of process parameters.
KW - Actor-critic learning
KW - Energy-efficient machining
KW - Optimal process control
KW - Tool wear
UR - http://www.scopus.com/inward/record.url?scp=85146435857&partnerID=8YFLogxK
U2 - 10.1016/j.jmsy.2023.01.005
DO - 10.1016/j.jmsy.2023.01.005
M3 - Journal article
AN - SCOPUS:85146435857
SN - 0278-6125
VL - 67
SP - 80
EP - 96
JO - Journal of Manufacturing Systems
JF - Journal of Manufacturing Systems
ER -