International Journal of Control, Automation, and Systems 2024; 22(10): 3068-3082
Published online October 1, 2024 https://doi.org/10.1007/s12555-023-0616-z
Copyright © The International Journal of Control, Automation, and Systems.
Haojun Zhong and Zhenlei Wang*
East China University of Science and Technology
PID controllers are widely used in industrial control, but their performance depends heavily on the controller parameters, and tuning these parameters is cumbersome and inefficient. Recently, deep reinforcement learning has been gradually introduced into the industrial control field because it can learn autonomously by interacting with the environment. In this paper, a PID parameter optimization method based on a twin delayed deep deterministic policy gradient (TD3) algorithm with a dynamic classification replay buffer (DCRB-TD3) is proposed. Through the designed optimization framework, the optimization of the PID parameters is converted into the learning of the actor network weights. To improve the learning efficiency of the reinforcement learning algorithm, avoid dispersion of the control curve, and ensure that the whole process remains continuously optimized in closed loop, the standard TD3 algorithm is improved: a dynamic classification ratio strategy is designed, and a sampling and update method for the dynamically classified experience replay is proposed. Finally, simulations are performed on several systems, and DCRB-TD3 is compared with a PID parameter optimization method based on the particle swarm optimization (PSO) algorithm. The results show that the PID parameters optimized by DCRB-TD3 achieve better control performance than those of the compared methods.
Keywords: Deep reinforcement learning, intelligent optimization, PID parameter optimization, twin delayed deep deterministic policy gradient (TD3).
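The abstract does not detail how the dynamic classification replay buffer operates. As a rough illustration only, the Python sketch below shows one plausible form of such a buffer: transitions are split into two pools by a reward criterion and sampled in a ratio that changes over training. The class name, the reward-threshold classification rule, and the ratio schedule are assumptions for illustration, not the authors' method.

```python
# Hypothetical sketch of a dynamically classified replay buffer.
# Classification rule and ratio schedule are illustrative assumptions only.
import random
from collections import deque


class DynamicClassifiedReplayBuffer:
    """Stores transitions in two pools ("good" / "ordinary") and samples
    from them in a ratio that is adjusted as training progresses."""

    def __init__(self, capacity=100_000, init_good_ratio=0.5, min_good_ratio=0.1):
        self.good = deque(maxlen=capacity // 2)      # higher-reward transitions
        self.ordinary = deque(maxlen=capacity // 2)  # remaining transitions
        self.init_good_ratio = init_good_ratio
        self.min_good_ratio = min_good_ratio
        self.good_ratio = init_good_ratio            # fraction drawn from the good pool

    def add(self, transition, reward, reward_threshold):
        # Classify each transition by its reward against a chosen threshold.
        if reward >= reward_threshold:
            self.good.append(transition)
        else:
            self.ordinary.append(transition)

    def update_ratio(self, episode, total_episodes):
        # Example schedule: favor good experiences early, then anneal the
        # ratio toward a floor so ordinary experiences keep being replayed.
        progress = episode / max(total_episodes, 1)
        self.good_ratio = max(self.min_good_ratio,
                              self.init_good_ratio * (1.0 - progress))

    def sample(self, batch_size):
        # Draw from each pool according to the current ratio, falling back
        # to the ordinary pool if the good pool is still small.
        n_good = min(int(batch_size * self.good_ratio), len(self.good))
        n_ordinary = min(batch_size - n_good, len(self.ordinary))
        batch = (random.sample(self.good, n_good)
                 + random.sample(self.ordinary, n_ordinary))
        random.shuffle(batch)
        return batch
```

In a TD3-style training loop, such a buffer would replace the uniform replay buffer: transitions are added with their rewards, `update_ratio` is called once per episode, and `sample` supplies the minibatches used to update the critic and actor networks.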