LOCOMOTION

Multi Criteria Reinforcement Learning Based on Goal-directed Exploration and its Application to Bipedal Walking Robot

K. Roger Aoki, Jun Sakuma, Takanobu Asai, Kokolo Ikeda, Shigenobu Kobayashi

Publication year: 2005
Citations: 9
Access: Open access

Abstract

In recent years, there has been growing demand for effective methods of acquiring complex control policies for real systems and real robots. Reinforcement learning is a key enabling technology for this purpose and has been studied extensively. In reinforcement learning, a scalar evaluation of control, called a reward, is defined so that a desirable behavior is obtained. In complex system control problems, however, the reward is often given as a vector. When reinforcement learning is applied in such cases, the rewards are usually reduced to a scalar, for example by a linearly weighted sum. In this paper, we explain why such scalarization is not appropriate. We adopt a multi-criteria reinforcement learning framework that handles the reward vector and the related value functions directly. In this setting, commonly adopted action selection strategies such as the ε-greedy strategy cannot be used. We therefore show the necessity and importance of a decision-making strategy in multi-criteria reinforcement learning. We propose a decision-making strategy that selects effective action candidates by the α-domination strategy and applies a goal-directed bias based on the achievement level of each criterion. We apply the proposed method to the walking control problem of a humanoid robot. Physical simulation results show that our method improves the walking control efficiently.
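
The abstract does not spell out the decision-making strategy, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. It assumes one vector of action values per action (higher is better on every criterion), a single uniform trade-off parameter `alpha` in the usual definition of α-domination, and an entirely hypothetical goal-directed rule that favors the criterion with the lowest achievement level. The names `alpha_dominates`, `candidate_actions`, `select_action`, and `achievement` are illustrative and do not come from the paper.

```python
import random


def alpha_dominates(qa, qb, alpha=0.1):
    """Return True if value vector qa alpha-dominates qb (maximization).

    Assumed definition with a uniform alpha:
        g_i = (qa_i - qb_i) + alpha * sum_{j != i} (qa_j - qb_j)
    qa alpha-dominates qb when every g_i >= 0 and at least one g_i > 0.
    With alpha = 0 this reduces to ordinary Pareto dominance.
    """
    diffs = [a - b for a, b in zip(qa, qb)]
    total = sum(diffs)
    g = [d + alpha * (total - d) for d in diffs]
    return all(x >= 0 for x in g) and any(x > 0 for x in g)


def candidate_actions(q_values, alpha=0.1):
    """Indices of actions that no other action alpha-dominates."""
    return [
        i for i, qi in enumerate(q_values)
        if not any(alpha_dominates(qj, qi, alpha)
                   for j, qj in enumerate(q_values) if j != i)
    ]


def select_action(q_values, achievement, alpha=0.1, epsilon=0.1, rng=random):
    """Pick an action among the alpha-non-dominated candidates.

    achievement[k] is an assumed score in [0, 1] describing how close
    criterion k is to its goal. The bias rule below (serve the least
    achieved criterion) is a hypothetical stand-in for the paper's
    goal-directed bias.
    """
    candidates = candidate_actions(q_values, alpha)
    if rng.random() < epsilon:
        # Exploration stays inside the candidate set.
        return rng.choice(candidates)
    worst = min(range(len(achievement)), key=lambda k: achievement[k])
    return max(candidates, key=lambda i: q_values[i][worst])


# Hypothetical two-criterion action values for five actions.
q_values = [(1.0, 0.0), (0.0, 1.0), (0.4, 0.4), (0.2, 0.2), (0.35, 0.42)]

print(candidate_actions(q_values, alpha=0.0))
print(candidate_actions(q_values, alpha=0.5))
print(select_action(q_values, achievement=[0.9, 0.2], alpha=0.5, epsilon=0.0))
```

In this toy data, action 2 with values (0.4, 0.4) is Pareto-optimal, yet no linear weighting w·q₁ + (1 − w)·q₂ ever ranks it first, because one of the two extreme actions always scores at least 0.5. This is one common illustration of why a linearly weighted sum can be inadequate for vector rewards.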

Keywords

Reinforcement learning, Computer science, Artificial intelligence, Q-learning, Robot, Action selection, Machine learning, Set (abstract data type), Robot learning, Control (management)
