

Off-Policy Natural Policy Gradient Method for a Biped Walking Using a CPG Controller

Yutaka Nakamura, Takeshi Mori, Yoichi Tokita, Tomohiro Shibata, Shin Ishii

Publication year: 2005
Citations: 3

Abstract

Motor control schemes that use a central pattern generator (CPG) controller, inspired by the mechanisms underlying animals’ rhythmic movements, have been studied. We previously proposed a reinforcement learning (RL) method, the CPG-actor-critic model, as an autonomous learning framework for a CPG controller. Here, we propose an off-policy natural policy gradient RL algorithm for the CPG-actor-critic model, to solve the “exploration-exploitation” problem by meta-controlling the “behavior policy.” We apply this RL algorithm to an automatic control problem using a biped robot simulator. Computer simulations demonstrate that, with the new algorithm, the CPG controller enables the biped robot to walk stably and efficiently.

Keywords

Central pattern generator, Controller (irrigation), Reinforcement learning, Computer science, Control theory (sociology), Biped robot, Robot, CpG site, Mechanism (biology), Control engineering
