International Journal of Control, Automation and Systems 2003; 1(1): 142-148
© The International Journal of Control, Automation, and Systems
Reinforcement learning is an important tool for robotic learning in unknown or uncertain environments. In this paper, we propose an evaluation function expressed in vector form to realize multi-dimensional reinforcement learning. The novel feature of the proposed method is that learning one behavior induces parallel learning of the other behaviors, even though the objectives of the behaviors differ. In brief, each behavior observes the other behaviors from a critical point of view. The resulting cross-criticism and parallel learning make the multi-dimensional learning process more efficient. With the proposed method, multi-dimensional evaluation (reward) and multi-dimensional learning are carried out simultaneously in one trial. A special neural network (Q-net), in which the weights and the output are represented by vectors, is proposed to realize a critic network for Q-learning. The proposed learning method is applied to behavior planning of mobile robots.
Keywords: Reinforcement learning, Q-learning, multi-dimensional evaluation, neural networks, intelligent robot
Published online March 1, 2003
Kazuo Kiguchi, Thrishantha Nanayakkara, Keigo Watanabe, and Toshio Fukuda
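The abstract describes a vector-valued evaluation in which a single trial yields one reward per behavior and all behaviors learn in parallel. The following is a minimal illustrative sketch of that idea, not the authors' method: it swaps the paper's Q-net (a neural critic with vector weights and outputs) for a plain tabular Q-function, and the names `make_q_table` and `vector_q_update`, the learning rate `alpha`, and the discount `gamma` are all assumptions introduced here for illustration.

```python
from typing import List

QTable = List[List[List[float]]]  # indexed as q[state][action][behavior]


def make_q_table(n_states: int, n_actions: int, n_behaviors: int) -> QTable:
    """Create a zero-initialized table holding one Q-value per behavior."""
    return [
        [[0.0] * n_behaviors for _ in range(n_actions)]
        for _ in range(n_states)
    ]


def vector_q_update(
    q: QTable,
    state: int,
    action: int,
    reward: List[float],
    next_state: int,
    alpha: float = 0.5,   # assumed learning rate
    gamma: float = 0.9,   # assumed discount factor
) -> List[float]:
    """Apply one Q-learning step to every behavior from a single transition.

    `reward` is a vector with one component per behavior, so one trial
    evaluates and updates all behaviors at once.
    """
    for k, r_k in enumerate(reward):
        # Greedy bootstrap value for behavior k in the next state.
        best_next = max(q[next_state][a][k] for a in range(len(q[next_state])))
        td_error = r_k + gamma * best_next - q[state][action][k]
        q[state][action][k] += alpha * td_error
    return q[state][action]


# Hypothetical usage: 2 states, 2 actions, 2 behaviors.
q = make_q_table(2, 2, 2)
first = vector_q_update(q, state=0, action=1, reward=[1.0, -0.5], next_state=1)
second = vector_q_update(q, state=1, action=0, reward=[0.0, 1.0], next_state=0)
```

In this sketch each behavior keeps its own component of the Q-vector, so an action taken while pursuing one objective still produces a learning signal for every other objective. How the paper's Q-net couples the components is not specified in the abstract and is not modeled here.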