Quasi-online reinforcement learning for robots

Boudewijn Bakker, Viktor Zhumatiy, G. Gruener, Jürgen Schmidhuber

Year of publication: 2006
Citations: 32

Abstract

This paper describes quasi-online reinforcement learning: while a robot is exploring its environment, in the background a probabilistic model of the environment is built on the fly as new experiences arrive; the policy is trained concurrently based on this model using an anytime algorithm. Prioritized sweeping, directed exploration, and transformed reward functions provide additional speed-ups. The robot quickly learns goal-directed policies from scratch, requiring few interactions with the environment and making efficient use of available computation time. From an outside perspective it learns the behavior online and in real time. We describe comparisons with standard methods and show the individual utility of each of the proposed techniques.
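The loop described in the abstract (extend a model with each new experience, then improve the policy on that model within a limited computation budget) can be illustrated with a minimal tabular sketch. This is a generic, textbook-style version of prioritized sweeping, not the authors' implementation: the class `TabularModel`, the functions `state_value`, `backup`, `prioritized_sweeping`, and `demo_chain`, and the parameters `gamma`, `theta`, and `budget` are all hypothetical names and values chosen for illustration. Directed exploration and transformed reward functions are not covered.

```python
import heapq
from collections import defaultdict


class TabularModel:
    """Empirical model built on the fly from (s, a, r, s') experiences (hypothetical)."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))  # (s, a) -> {s': visits}
        self.reward_sum = defaultdict(float)                 # (s, a) -> summed reward
        self.predecessors = defaultdict(set)                 # s' -> {(s, a)}
        self.actions = defaultdict(set)                      # s -> actions tried so far

    def update(self, s, a, r, s_next):
        self.counts[(s, a)][s_next] += 1
        self.reward_sum[(s, a)] += r
        self.predecessors[s_next].add((s, a))
        self.actions[s].add(a)


def state_value(model, Q, s):
    # Greedy value over the actions tried in s; unexplored states count as 0.
    return max((Q[(s, a)] for a in model.actions[s]), default=0.0)


def backup(model, Q, s, a, gamma):
    # Expected one-step return of (s, a) under the empirical model.
    outcomes = model.counts[(s, a)]
    n = sum(outcomes.values())
    expected_next = sum(c * state_value(model, Q, s2) for s2, c in outcomes.items()) / n
    return model.reward_sum[(s, a)] / n + gamma * expected_next


def prioritized_sweeping(model, Q, seeds, gamma=0.95, theta=1e-6, budget=1000):
    """Perform at most `budget` backups, largest expected change first.

    Stopping early still leaves a usable Q, which is what makes the
    procedure anytime. Returns the number of backups performed.
    """
    queue = []

    def push(s, a):
        priority = abs(backup(model, Q, s, a, gamma) - Q[(s, a)])
        if priority > theta:
            heapq.heappush(queue, (-priority, s, a))  # max-heap via negation

    for s, a in seeds:
        push(s, a)
    done = 0
    while queue and done < budget:
        _, s, a = heapq.heappop(queue)
        Q[(s, a)] = backup(model, Q, s, a, gamma)
        done += 1
        # The value of s may have changed, so its predecessors may be stale.
        for ps, pa in model.predecessors[s]:
            push(ps, pa)
    return done


def demo_chain():
    # Deterministic chain 0 -> 1 -> 2 -> 3 with reward 1 on reaching state 3.
    model, Q = TabularModel(), defaultdict(float)
    for s, a, r, s_next in [(0, "right", 0.0, 1), (1, "right", 0.0, 2), (2, "right", 1.0, 3)]:
        model.update(s, a, r, s_next)                  # model grows as experience arrives
        prioritized_sweeping(model, Q, [(s, a)])       # planning on the current model
    return Q
```

In `demo_chain`, the first two experiences trigger no backups because nothing in the model predicts a reward yet. The third experience seeds the queue, and the reward then propagates backward through the stored predecessors, so state 0 receives a value after a single pass along the chain.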

Keywords

Reinforcement learning, Computer science, Robot, Scratch, Probabilistic logic, Perspective (graphical), Artificial intelligence, On the fly, Human–computer interaction, Computation
