International Journal of Control, Automation and Systems 2015; 13(1): 99-109
Published online December 18, 2014
https://doi.org/10.1007/s12555-014-0085-5
© The International Journal of Control, Automation, and Systems
This paper develops a concurrent learning-based approximate dynamic programming (ADP) algorithm for solving the two-player zero-sum (ZS) game arising in H∞ control of continuous-time (CT) systems with unknown nonlinear dynamics. First, the H∞ control problem is formulated as a ZS game; then an online algorithm is developed that learns the solution to the Hamilton-Jacobi-Isaacs (HJI) equation without using any knowledge of the system dynamics. This is achieved by using a neural network (NN) identifier to approximate the uncertain system dynamics. The algorithm is implemented on an actor-critic-disturbance NN structure, together with the NN identifier, to approximate the optimal value function and the corresponding Nash solution of the game. All NNs are tuned simultaneously. By using concurrent learning, the need to check the persistency of excitation condition is relaxed to a simplified condition. The stability of the overall system is guaranteed, and convergence to the Nash solution of the game is shown. Simulation results show the effectiveness of the algorithm.
Keywords: Approximate dynamic programming, concurrent learning, H∞ control, neural networks, two-player zero-sum game, unknown dynamics.
Published online February 1, 2015
Sholeh Yasini, Mohammad Bagher Naghibi Sistani*, and Ali Karimpour
Ferdowsi University of Mashhad
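The abstract states that concurrent learning replaces the persistency of excitation check with a simpler condition. The sketch below illustrates that general idea on a toy linear-in-parameters estimator; it is not the paper's NN tuning law. The function names (`history_is_rich`, `cl_step`), the regressors, the learning rate, and the "true" weights are all hypothetical, chosen only for illustration. The assumption is that the simplified condition amounts to a rank condition on a stack of recorded regressors, which is the usual form in the concurrent learning literature.

```python
import numpy as np

def history_is_rich(Phi_hist):
    """Rank condition (assumed form): stored regressors, one per column, span R^n."""
    return np.linalg.matrix_rank(Phi_hist) == Phi_hist.shape[0]

def cl_step(W, phi, y, Phi_hist, Y_hist, lr=0.1):
    """One normalized gradient step using the current sample plus all stored samples."""
    def grad(p, target):
        err = W @ p - target              # scalar prediction error
        return err * p / (1.0 + p @ p)    # normalized gradient of 0.5 * err**2
    g = grad(phi, y)
    for p, t in zip(Phi_hist.T, Y_hist):  # replay the recorded data concurrently
        g = g + grad(p, t)
    return W - lr * g

# Hypothetical "true" weights that the estimator should recover.
W_true = np.array([1.0, -2.0, 0.5])

# Recorded regressors (columns) and their recorded outputs.
Phi_hist = np.array([[1.0, 0.0, 1.0],
                     [0.0, 1.0, 1.0],
                     [0.0, 0.0, 1.0]])
Y_hist = W_true @ Phi_hist

# The current regressor never changes, so it is not persistently exciting.
phi_now = np.array([1.0, 0.0, 0.0])
y_now = W_true @ phi_now

# With the recorded data: all weights are identified.
W_cl = np.zeros(3)
for _ in range(2000):
    W_cl = cl_step(W_cl, phi_now, y_now, Phi_hist, Y_hist)

# Without recorded data: only the excited direction is identified.
no_Phi, no_Y = np.zeros((3, 0)), np.zeros(0)
W_plain = np.zeros(3)
for _ in range(2000):
    W_plain = cl_step(W_plain, phi_now, y_now, no_Phi, no_Y)
```

In this toy setting, the estimate that uses recorded data reaches `W_true` even though the current signal excites only one direction, while the plain gradient estimate stays wrong in the unexcited directions. The rank check on the stored data can be evaluated once, from recorded samples, rather than requiring a condition on future signals.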