Reinforcement Learning and Approximate Dynamic Programming (RLADP)—Foundations, Common Misconceptions, and the Challenges Ahead

Paul J. Werbos

Publication year
2012
Citations
29

Abstract

Many new formulations of reinforcement learning and approximate dynamic programming (RLADP) have appeared in recent years, as the field has grown in control applications, control theory, operations research, computer science, robotics, and efforts to understand brain intelligence. This chapter reviews the foundations and challenges common to all these areas in a unified way, with reference to their variations. It explains the basic tools: the Bellman equation and the value and policy functions. The chapter highlights cases where experience in one area sheds light on obstacles or common misconceptions in another. Many common beliefs about the limits of RLADP are based on such obstacles and misconceptions, for which solutions already exist. The chapter pinpoints key opportunities for future research that are important to the field as a whole and to the larger benefits it offers.

Controlled Vocabulary Terms: learning (artificial intelligence); multi-agent systems; programming environments

Keywords

Reinforcement learning; Field (mathematics); Computer science; Artificial intelligence; Key (lock); Control (management); Dynamic programming; Bellman equation; Vocabulary; Management science
