
Reinforcement Learning and Approximate Dynamic Programming (RLADP)—Foundations, Common Misconceptions, and the Challenges Ahead

Paul J. Werbos

Year: 2012
Citations: 29

Abstract

Many new formulations of reinforcement learning and approximate dynamic programming (RLADP) have appeared in recent years, as the field has grown in control applications, control theory, operations research, computer science, robotics, and efforts to understand brain intelligence. This chapter reviews the foundations and challenges common to all these areas, in a unified way but with reference to their variations. It explains the basic tools: the Bellman equation and the value and policy functions. The chapter highlights cases where experience in one area sheds light on obstacles or common misconceptions in another. Many common beliefs about the limits of RLADP are based on such obstacles and misconceptions, for which solutions already exist. The chapter pinpoints key opportunities for future research important to the field as a whole and to the larger benefits it offers.

Controlled Vocabulary Terms: learning (artificial intelligence); multi-agent systems; programming environments

Keywords

Reinforcement learning; Field (mathematics); Computer science; Artificial intelligence; Key (lock); Control (management); Dynamic programming; Bellman equation; Vocabulary; Management science
