TEXPLORE: Real-Time Sample-Efficient Reinforcement Learning for Robots
Todd Hester, Peter Stone
- Publication year
- 2012
- Citation count
- 3
Abstract
Reinforcement Learning (RL) is a paradigm for learning decision-making tasks that could enable robots to learn and adapt to situations on-line. For an RL algorithm to be practical for robotic control tasks, it must learn in very few samples, while continually taking actions in real-time. In addition, the algorithm must learn efficiently in the face of noise, sensor/actuator delays and continuous state features. In this paper, we describe TEXPLORE, a model-based RL method that addresses these issues. It learns a random forest model of the domain which generalizes dynamics to unseen states. The agent targets exploration on states that are both promising for the final policy and uncertain in the model. With sample-based planning and a novel parallel architecture, TEXPLORE can select actions continually in real-time whenever necessary. We empirically evaluate TEXPLORE learning to control the velocity of an autonomous vehicle in real-time.
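The abstract's idea of a model that is "uncertain" in unseen states can be illustrated with ensemble disagreement. The following is a minimal sketch and not the authors' implementation: bootstrap-resampled linear fits stand in for the random forest, the one-dimensional velocity dynamics (`0.9 * v` plus noise) are invented for illustration, and all function names are hypothetical.

```python
import random
import statistics


def fit_line(points):
    """Least-squares fit y = a + b*x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:  # degenerate resample: fall back to a constant model
        return mean_y, 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def train_ensemble(data, n_models=25, seed=0):
    """Fit one line per bootstrap resample (a stand-in for forest trees)."""
    rng = random.Random(seed)
    return [fit_line([rng.choice(data) for _ in data]) for _ in range(n_models)]


def predict(ensemble, x):
    """Return the mean prediction and the spread across ensemble members."""
    preds = [a + b * x for a, b in ensemble]
    return statistics.mean(preds), statistics.pstdev(preds)


def make_data(n=30, seed=1):
    """Hypothetical toy dynamics: next velocity = 0.9 * velocity + noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        v = rng.uniform(0.0, 1.0)
        data.append((v, 0.9 * v + rng.gauss(0.0, 0.1)))
    return data


data = make_data()
ensemble = train_ensemble(data)
_, spread_seen = predict(ensemble, 0.5)     # inside the training range
_, spread_unseen = predict(ensemble, 10.0)  # far outside the training range
print(spread_seen, spread_unseen)
```

The spread across members is small where training data exists and grows for states far from it, which is the kind of signal an agent could combine with a value estimate to decide where exploration is worthwhile.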