LEARNING

Reinforcement Learning and Approximate Dynamic Programming (RLADP)—Foundations, Common Misconceptions, and the Challenges Ahead

Paul J. Werbos

Year of publication: 2012
Citations: 29

Abstract

Many new formulations of reinforcement learning and approximate dynamic programming (RLADP) have appeared in recent years, as the field has grown in control applications, control theory, operations research, computer science, robotics, and efforts to understand brain intelligence. This chapter reviews the foundations and challenges common to all these areas, in a unified way but with reference to their variations. It explains the basic tools: the Bellman equation, and value and policy functions. The chapter highlights cases where experience in one area sheds light on obstacles or common misconceptions in another. Many common beliefs about the limits of RLADP are based on such obstacles and misconceptions, for which solutions already exist. The chapter pinpoints key opportunities for future research important to the field as a whole and to the larger benefits it offers.

Controlled Vocabulary Terms: learning (artificial intelligence); multi-agent systems; programming environments
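The abstract names the Bellman equation and value and policy functions without stating them. As a rough illustration only, and not taken from the chapter itself, the sketch below applies the standard discounted Bellman optimality backup, V(s) = max over a of [r(s, a) + gamma * V(s')], to a hypothetical two-state, two-action problem with deterministic transitions. The toy problem, the discount factor, and all names (`NEXT_STATE`, `REWARD`, `value_iteration`, `greedy_policy`) are assumptions made for this example.

```python
# Minimal value-iteration sketch on a hypothetical toy problem.
# Everything here (states, rewards, GAMMA) is an illustrative assumption,
# not material from the chapter.

GAMMA = 0.9          # assumed discount factor
STATES = (0, 1)
ACTIONS = (0, 1)     # 0 = stay, 1 = move to the other state

# Deterministic transitions: (state, action) -> next state.
NEXT_STATE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Reward of 1 only for staying in state 1.
REWARD = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 1.0, (1, 1): 0.0}


def q_value(values, s, a):
    """One-step lookahead: immediate reward plus discounted next-state value."""
    return REWARD[(s, a)] + GAMMA * values[NEXT_STATE[(s, a)]]


def value_iteration(tol=1e-10, max_sweeps=10_000):
    """Repeat the Bellman optimality backup until the value function settles."""
    values = {s: 0.0 for s in STATES}
    for _ in range(max_sweeps):
        new_values = {
            s: max(q_value(values, s, a) for a in ACTIONS) for s in STATES
        }
        delta = max(abs(new_values[s] - values[s]) for s in STATES)
        values = new_values
        if delta < tol:
            break
    return values


def greedy_policy(values):
    """Policy function: pick the action with the best one-step lookahead."""
    return {s: max(ACTIONS, key=lambda a: q_value(values, s, a)) for s in STATES}


values = value_iteration()
policy = greedy_policy(values)
```

In this toy case the value function converges towards V(1) = 1 / (1 - 0.9) = 10 and V(0) = 0.9 * 10 = 9, and the greedy policy moves from state 0 to state 1 and then stays there. Exact tabular backups like this only work for very small state spaces; the approximate methods the abstract refers to replace the table with a learned function approximator.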

Keywords

Reinforcement learning; Field (mathematics); Computer science; Artificial intelligence; Key (lock); Control (management); Dynamic programming; Bellman equation; Vocabulary; Management science

Related papers

Statistical Learning Theory

Artificial intelligence: a modern approach

Applied Nonlinear Control

A new optimizer using particle swarm theory