Model-Free Reinforcement Learning with Continuous Action in Practice

Thomas Degris, Patrick M. Pilarski, Richard S. Sutton

Year: 2012
Citations: 234

Abstract

Reinforcement learning methods are often considered a potential solution for enabling a robot to adapt in real time to changes in an unpredictable environment. However, with continuous actions, only a few existing algorithms are practical for real-time learning. In such a setting, the most effective methods have used a parameterized policy structure, often with a separate parameterized value function. The goal of this paper is to assess such actor-critic methods to form a fully specified practical algorithm. Our specific contributions include 1) developing the extension of existing incremental policy-gradient algorithms to use eligibility traces, 2) an empirical comparison of the resulting algorithms using continuous actions, and 3) the evaluation of a gradient-scaling technique that can significantly improve performance. Finally, we apply our actor-critic algorithm to learn on a robotic platform with a fast sensorimotor cycle (10 ms). Overall, these results constitute an important step towards practical real-time learning control with continuous action.
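To make the ingredients named in the abstract concrete, the following is a minimal sketch of an incremental actor-critic update with eligibility traces and a Gaussian policy over a one-dimensional continuous action. It is not the authors' fully specified algorithm: the class name `GaussianActorCritic`, the linear parameterization, the step sizes, and the reading of "gradient scaling" as multiplying the log-policy gradient by the variance are all assumptions made for illustration.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


class GaussianActorCritic:
    """Illustrative actor-critic with eligibility traces (hypothetical names).

    Critic: linear value estimate v(x) = w . x
    Actor:  Gaussian policy with mean u_mu . x and std exp(u_sigma . x)
    """

    def __init__(self, n_features, alpha_v=0.1, alpha_u=0.01,
                 gamma=0.9, lam=0.7, scale_by_variance=True):
        self.alpha_v, self.alpha_u = alpha_v, alpha_u
        self.gamma, self.lam = gamma, lam
        # Assumption: "gradient scaling" is sketched as a sigma^2 factor.
        self.scale_by_variance = scale_by_variance
        self.w = [0.0] * n_features          # critic weights
        self.u_mu = [0.0] * n_features       # actor weights for the mean
        self.u_sigma = [0.0] * n_features    # actor weights for log-std
        self.e_v = [0.0] * n_features        # critic trace
        self.e_mu = [0.0] * n_features       # actor trace (mean)
        self.e_sigma = [0.0] * n_features    # actor trace (log-std)

    def policy(self, x):
        """Return the mean and standard deviation of the action distribution."""
        return dot(self.u_mu, x), math.exp(dot(self.u_sigma, x))

    def act(self, x, rng=random):
        """Sample a continuous action from the current Gaussian policy."""
        mu, sigma = self.policy(x)
        return rng.gauss(mu, sigma)

    def update(self, x, a, r, x_next, terminal=False):
        """Apply one incremental update and return the TD error."""
        v = dot(self.w, x)
        v_next = 0.0 if terminal else dot(self.w, x_next)
        delta = r + self.gamma * v_next - v

        # Gradient of log pi(a|x) with respect to mean and log-std outputs.
        mu, sigma = self.policy(x)
        g_mu = (a - mu) / sigma ** 2
        g_sigma = (a - mu) ** 2 / sigma ** 2 - 1.0
        if self.scale_by_variance:
            g_mu *= sigma ** 2
            g_sigma *= sigma ** 2

        decay = self.gamma * self.lam
        for i, xi in enumerate(x):
            # Accumulating eligibility traces for critic and actor.
            self.e_v[i] = decay * self.e_v[i] + xi
            self.e_mu[i] = decay * self.e_mu[i] + g_mu * xi
            self.e_sigma[i] = decay * self.e_sigma[i] + g_sigma * xi
            # The TD error drives all three parameter vectors.
            self.w[i] += self.alpha_v * delta * self.e_v[i]
            self.u_mu[i] += self.alpha_u * delta * self.e_mu[i]
            self.u_sigma[i] += self.alpha_u * delta * self.e_sigma[i]
        return delta
```

In use, a control loop would call `act(x)` on the current feature vector, apply the action, and then call `update(x, a, r, x_next)` once per step. Each update costs time linear in the number of features, which is the property that makes this family of methods a candidate for fast sensorimotor cycles.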

Keywords

Reinforcement learning, Parameterized complexity, Computer science, Action (physics), Artificial intelligence, Function (biology), Control (management), Robot, Extension (predicate logic), Machine learning
