Regular Papers

International Journal of Control, Automation, and Systems 2024; 22(5): 1751-1759

https://doi.org/10.1007/s12555-022-1133-1

© The International Journal of Control, Automation, and Systems

Reinforcement Q-learning and Optimal Tracking Control of Unknown Discrete-time Multi-player Systems Based on Game Theory

Jin-Gang Zhao

Weifang University

Abstract

This paper studies the fully cooperative game tracking control problem (FCGTCP) for a class of discrete-time multi-player linear systems with unknown dynamics. The reference trajectory is generated by a command generator system. An augmented multi-player system composed of the original multi-player system and the command generator system is constructed, and an exponentially discounted cost function is introduced to derive an augmented fully cooperative game tracking algebraic Riccati equation (FCGTARE). When the system dynamics are known, a model-based policy iteration (PI) algorithm is proposed to solve the augmented FCGTARE. Furthermore, to remove the requirement of known system dynamics, an online reinforcement Q-learning algorithm is designed to obtain the solution to the augmented FCGTARE. The convergence of the designed online reinforcement Q-learning algorithm is proved. Finally, two simulation examples are given to verify the validity of the model-based PI algorithm and the online reinforcement Q-learning algorithm.

Keywords: Discrete-time, fully cooperative game (FCG), multi-player systems, Q-learning, tracking control.
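The model-based step described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that, because all players in a fully cooperative game share one cost, their inputs can be stacked into a single input vector, which turns the problem into a standard discounted linear-quadratic tracking problem solved by textbook policy iteration. The matrices `A`, `B1`, `B2`, `C`, `F`, the weights `Q` and `R`, and the discount factor `gamma` are hypothetical values chosen only for illustration.

```python
import numpy as np

def policy_iteration(T, B, Q1, R, gamma, K0, iters=50):
    """Textbook policy iteration for a discounted LQ problem (illustrative sketch)."""
    n = T.shape[0]
    K = K0
    for _ in range(iters):
        Ac = T - B @ K                      # closed-loop augmented dynamics
        M = Q1 + K.T @ R @ K                # stage cost under the current policy
        # Policy evaluation: solve P = M + gamma * Ac^T P Ac by vectorization.
        lhs = np.eye(n * n) - gamma * np.kron(Ac.T, Ac.T)
        P = np.linalg.solve(lhs, M.reshape(-1)).reshape(n, n)
        P = 0.5 * (P + P.T)                 # symmetrize against round-off
        # Policy improvement for the stacked input of all players.
        K = np.linalg.solve(R + gamma * B.T @ P @ B, gamma * B.T @ P @ T)
    return P, K

def are_residual(P, T, B, Q1, R, gamma):
    """Largest entry of the discounted algebraic Riccati equation residual."""
    G = R + gamma * B.T @ P @ B
    rhs = (Q1 + gamma * T.T @ P @ T
           - gamma**2 * T.T @ P @ B @ np.linalg.solve(G, B.T @ P @ T))
    return np.max(np.abs(P - rhs))

# Hypothetical two-player plant and constant-reference command generator.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B1 = np.array([[0.0], [1.0]])               # input matrix of player 1 (assumed)
B2 = np.array([[1.0], [0.0]])               # input matrix of player 2 (assumed)
C = np.array([[1.0, 0.0]])
F = np.array([[1.0]])                       # r_{k+1} = F r_k

# Augmented state X = [x; r]; tracking error e = C x - r.
T = np.block([[A, np.zeros((2, 1))], [np.zeros((1, 2)), F]])
B = np.vstack([np.hstack([B1, B2]), np.zeros((1, 2))])
E = np.hstack([C, -np.eye(1)])
Q = np.array([[10.0]])
Q1 = E.T @ Q @ E
R = np.diag([1.0, 1.0])                     # block-diagonal player weights
gamma = 0.8

# K0 = 0 is admissible here because sqrt(gamma) * spectral_radius(T) < 1.
K0 = np.zeros((2, 3))
P, K = policy_iteration(T, B, Q1, R, gamma, K0)
print("ARE residual:", are_residual(P, T, B, Q1, R, gamma))
```

In this sketch, row `i` of `K` plays the role of player `i`'s feedback gain on the augmented state, so that `u_i = -K[i] @ X`. The Q-learning algorithm mentioned in the abstract would replace the model-based evaluation step with an estimate built from measured data, which this sketch does not attempt.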

Published online May 1, 2024 https://doi.org/10.1007/s12555-022-1133-1



IJCAS

eISSN 2005-4092
pISSN 1598-6446