International Journal of Control, Automation and Systems 2003; 1(3): 358-367
© The International Journal of Control, Automation, and Systems
We consider discrete-time factorial Markov decision processes (MDPs) in environments with multiple decision-makers under the infinite-horizon average-reward criterion, with a general joint reward structure but a factorial joint state-transition structure. We introduce the concept of “localization,” in which the global MDP is localized for each agent so that each agent need only consider a local MDP defined over its own state and action spaces. Based on this concept, we present a gradient-ascent-like iterative distributed algorithm that converges to a locally optimal solution of the global MDP. The solution is an autonomous joint policy in the sense that each agent’s decision depends only on its local state.
Keywords: Distributed algorithm, local optimal solution, Markov decision process, multi-agent
Published online September 1, 2003
Hyeong Soo Chang