LEARNING

Off-Policy Natural Policy Gradient Method for a Biped Walking Using a CPG Controller

Yutaka Nakamura, Takeshi Mori, Yoichi Tokita, Tomohiro Shibata, Shin Ishii

Year of publication
2005
Citations
3

Abstract

Motor control schemes using a central pattern generator (CPG) controller, inspired by the mechanism of animals’ rhythmic movements, have been studied. We previously proposed a reinforcement learning (RL) method, the CPG-actor-critic model, as an autonomous learning framework for a CPG controller. Here, we propose an off-policy natural policy gradient RL algorithm for the CPG-actor-critic model that addresses the “exploration-exploitation” problem by meta-controlling the “behavior policy.” We apply this RL algorithm to an automatic control problem using a biped robot simulator. Computer simulations demonstrated that, with our new algorithm, the CPG controller enables the biped robot to walk stably and efficiently.

Keywords

Central pattern generator; Controller (irrigation); Reinforcement learning; Computer science; Control theory (sociology); Biped robot; Robot; CpG site; Mechanism (biology); Control engineering
