Regular Papers

International Journal of Control, Automation and Systems 2003; 1(3): 358-367

© The International Journal of Control, Automation, and Systems

Localization and a Distributed Local Optimal Solution Algorithm for a Class of Multi-Agent Markov Decision Processes

Hyeong Soo Chang

Abstract

We consider discrete-time factorial Markov Decision Processes (MDPs) in a multiple-decision-maker environment under the infinite-horizon average-reward criterion, with a general joint reward structure but a factorial joint state-transition structure. We introduce the concept of “localization”, in which a global MDP is localized for each agent so that each agent needs to consider only a local MDP defined over its own state and action spaces. Based on this concept, we present a gradient-ascent-like iterative distributed algorithm that converges to a local optimal solution of the global MDP. The solution is an autonomous joint policy in the sense that each agent’s decision is based only on its local state.

Keywords: Distributed algorithm, local optimal solution, Markov decision process, multi-agent
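
The localization idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it is a plain coordinate-ascent (best-response) loop written under assumptions that the abstract does not state, namely two agents, ergodic local chains, and brute-force search over deterministic local policies. All names (`stationary`, `local_reward`, `best_response`, `coordinate_ascent`) and the toy numbers in `P` and `R` are hypothetical. With factorial transitions and autonomous policies, the joint stationary distribution is the product of the agents' local stationary distributions, so averaging the joint reward over the other agent's stationary distribution yields a local reward defined only on agent i's own states and actions.

```python
import itertools

def stationary(chain, iters=1000):
    """Stationary distribution by power iteration (assumes an ergodic chain)."""
    n = len(chain)
    d = [1.0 / n] * n
    for _ in range(iters):
        d = [sum(d[s] * chain[s][t] for s in range(n)) for t in range(n)]
    return d

def induced_chain(P_i, pol_i):
    """Local Markov chain of one agent under its local policy."""
    return [P_i[s][pol_i[s]] for s in range(len(P_i))]

def local_reward(P, R, policy, i):
    """Localize the joint reward for agent i by averaging over the other
    agent's stationary state distribution, with its policy held fixed."""
    j = 1 - i
    d_j = stationary(induced_chain(P[j], policy[j]))

    def joint(si, ai, sj):
        aj = policy[j][sj]
        return R[si][sj][ai][aj] if i == 0 else R[sj][si][aj][ai]

    return [[sum(d_j[sj] * joint(si, ai, sj) for sj in range(len(P[j])))
             for ai in range(len(P[i][0]))] for si in range(len(P[i]))]

def local_value(P_i, r_i, pol_i):
    """Average reward of a local policy in the local MDP (P_i, r_i)."""
    d_i = stationary(induced_chain(P_i, pol_i))
    return sum(d_i[s] * r_i[s][pol_i[s]] for s in range(len(P_i)))

def best_response(P, R, policy, i, tol=1e-9):
    """Best deterministic local policy for agent i (brute-force search)."""
    r_i = local_reward(P, R, policy, i)
    best, best_val = policy[i], local_value(P[i], r_i, policy[i])
    n_s, n_a = len(P[i]), len(P[i][0])
    for cand in itertools.product(range(n_a), repeat=n_s):
        val = local_value(P[i], r_i, cand)
        if val > best_val + tol:  # accept strict improvements only
            best, best_val = cand, val
    return best, best_val

def coordinate_ascent(P, R, policy):
    """Agents take turns improving their own local policy until none can."""
    policy = [tuple(p) for p in policy]
    history = [local_value(P[0], local_reward(P, R, policy, 0), policy[0])]
    improved = True
    while improved:
        improved = False
        for i in range(2):
            new, val = best_response(P, R, policy, i)
            if new != policy[i]:
                policy[i] = new
                history.append(val)
                improved = True
    return policy, history

# Toy instance (hypothetical numbers): P[i][s][a] is agent i's next-state
# distribution; R[s0][s1][a0][a1] rewards matching states, minus action costs.
P = [
    [[[0.9, 0.1], [0.3, 0.7]], [[0.6, 0.4], [0.1, 0.9]]],
    [[[0.8, 0.2], [0.4, 0.6]], [[0.5, 0.5], [0.2, 0.8]]],
]
R = [[[[(1.0 if s0 == s1 else 0.0) - 0.1 * (a0 + a1) for a1 in range(2)]
       for a0 in range(2)] for s1 in range(2)] for s0 in range(2)]

final_policy, history = coordinate_ascent(P, R, [(1, 1), (0, 0)])
print(final_policy, history)
```

Because every accepted update strictly increases the average reward and the set of deterministic joint policies is finite, the loop terminates at a joint policy that no single agent can improve on its own. That is a local optimum in the coordinate-wise sense used here, which is not necessarily a global optimum of the joint MDP.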

Published online September 1, 2003

IJCAS

eISSN 2005-4092
pISSN 1598-6446