Design and implementation of general purpose reinforcement learning agents
Tyler Streeter
- Year: 2005
- Citations: 2
Abstract
Intelligent agents are becoming increasingly important in our society. We currently have house cleaning robots, computer-controlled opponents in video games, unmanned aerial combat vehicles, entertainment robots, and autonomous explorers in outer space. But there are many problems with the current generation of intelligent agents. Most of these problems stem from the fact that they are designed for very specific problems. Each intelligent agent has limited adaptability to new tasks; if conditions change slightly, the agent may quickly become confused. Additionally, a huge engineering effort is required to design an agent for each new task. Ideally, we would have a reusable general purpose agent design. Such a general purpose agent would be able to adapt to changing environments and would be easy to train to handle new tasks. To implement this agent design, we can use ideas from the field of reinforcement learning, an approach with strong mathematical foundations and intriguing biological implications. The available reinforcement learning algorithms are powerful because of their generality: agents simply receive a scalar reward value representing success or failure. Additionally, these algorithms can be combined with other powerful ideas (e.g., planning from a learned internal model). This thesis provides a step towards the goal of general purpose agents. It discusses a detailed agent design and provides a concrete software implementation of these ideas. It covers the components necessary for such a general purpose agent, starting with a minimal design and proceeding to develop a more powerful learning architecture. The final design uses temporal difference learning, radial basis functions, planning, uncertainty estimation, and curiosity.
The main contributions of this thesis are: a novel combination of temporal difference learning with planning, uncertainty, and curiosity; a discussion of correlations between theoretical reinforcement learning and reward processing in biological brains; a practical Open Source implementation of general purpose reinforcement learning agents; and experimental results showing learning performance on several tasks, including two physical control problems.
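As a rough illustration of two of the ingredients named in the abstract (temporal difference learning and radial basis functions), the sketch below runs TD(0) value prediction with Gaussian radial basis features on a small random walk. It is not taken from the thesis or its software: the task, the function names (rbf_features, td0_random_walk), and all parameter values are assumptions chosen for illustration, and planning, uncertainty estimation, and curiosity are not shown.

```python
import math
import random


def rbf_features(x, centers, width):
    """Gaussian radial basis activations for a scalar input x."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def td0_random_walk(episodes=2000, n_states=5, alpha=0.05, gamma=1.0, seed=0):
    """TD(0) prediction with a linear value function over RBF features.

    Hypothetical task: states 1..n_states are non-terminal. Each step moves
    left or right with equal probability. Leaving on the right gives a
    scalar reward of 1, leaving on the left gives 0.
    """
    rng = random.Random(seed)
    spacing = 1.0 / (n_states + 1)
    centers = [i * spacing for i in range(1, n_states + 1)]
    width = 0.5 * spacing  # assumed width: neighboring features overlap slightly
    weights = [0.0] * len(centers)

    def value(state):
        # Linear value estimate: dot product of weights and RBF activations.
        phi = rbf_features(state * spacing, centers, width)
        return sum(w * f for w, f in zip(weights, phi)), phi

    for _ in range(episodes):
        state = (n_states + 1) // 2  # start in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            terminal = next_state in (0, n_states + 1)
            reward = 1.0 if next_state == n_states + 1 else 0.0

            v, phi = value(state)
            v_next = 0.0 if terminal else value(next_state)[0]

            # TD error: reward plus discounted next value, minus current value.
            td_error = reward + gamma * v_next - v
            for i, f in enumerate(phi):
                weights[i] += alpha * td_error * f

            if terminal:
                break
            state = next_state

    return [value(s)[0] for s in range(1, n_states + 1)]


if __name__ == "__main__":
    for s, v in enumerate(td0_random_walk(), start=1):
        print(f"state {s}: estimated value {v:.3f}")
```

For this walk the true value of state s is s / (n_states + 1), so the estimates should rise roughly linearly from left to right. With a constant step size they fluctuate around those values rather than converging exactly.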