Reinforcement Learning and Approximate Dynamic Programming (RLADP)—Foundations, Common Misconceptions, and the Challenges Ahead
Paul J. Werbos
Year: 2012
Citations: 29
Abstract
Many new formulations of reinforcement learning and approximate dynamic programming (RLADP) have appeared in recent years, as the field has grown in control applications, control theory, operations research, computer science, robotics, and efforts to understand brain intelligence. This chapter reviews the foundations and challenges common to all these areas, in a unified way but with reference to their variations. It explains the basic tools: the Bellman equation and the value and policy functions. The chapter highlights cases where experience in one area sheds light on obstacles or common misconceptions in another. Many common beliefs about the limits of RLADP are based on such obstacles and misconceptions, for which solutions already exist. The chapter pinpoints key opportunities for future research important to the field as a whole and to the larger benefits it offers.
Controlled Vocabulary Terms: learning (artificial intelligence); multi-agent systems; programming environments