Approximate Discounted Dynamic Programming Is Unreliable

Matthew A. McDonald, Philip Hingston

Year: 1994
Citations: 6

Abstract

Popular reinforcement learning methods that employ generalising function approximators perform poorly in many domains. We analyse effects of approximation error in domains with sparse rewards, revealing the extent of scaling difficulties. Empirical evidence is presented that suggests when problems are likely to occur and explains some of the widely differing results reported in the literature.

Keywords: Reinforcement learning, dynamic programming, function approximation, induction, problem solving

CR categories: I.2.6, I.2.8

* The Robotics and Vision Research Group acknowledges the support received from Digital through their External Research Programme.

Department of Computer Science

1 Introduction

Most domains studied in AI are too large to be searched exhaustively. It is widely believed that reinforcement learning methods must be combined with generalising function approximators in order to sc...
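The interaction between discounting, sparse rewards, and approximation error mentioned in the abstract can be illustrated with a small calculation. The sketch below is not the authors' analysis; it assumes a hypothetical deterministic chain with a single reward of 1 at the goal, a discount factor `gamma`, and a function approximator whose worst-case error is `epsilon`. Under those assumptions the optimal value of a state `d` steps from the goal is `gamma ** d`, so the gap between neighbouring states shrinks geometrically with distance, and two estimates that may each be off by `epsilon` can only be ranked reliably while the gap exceeds `2 * epsilon`. The function names are illustrative.

```python
def adjacent_value_gap(gamma: float, distance: int) -> float:
    """Gap between the optimal values of the states `distance - 1` and
    `distance` steps from the goal, assuming V(d) = gamma ** d."""
    return gamma ** (distance - 1) - gamma ** distance


def max_rankable_distance(gamma: float, epsilon: float) -> int:
    """Largest distance at which the adjacent-state gap still exceeds
    2 * epsilon, i.e. where a worst-case error of epsilon per estimate
    cannot reverse the ordering of neighbouring states."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    if epsilon <= 0.0:
        raise ValueError("epsilon must be positive")
    distance = 0
    # The gap decreases monotonically with distance, so stop at the
    # first distance where it no longer exceeds the error margin.
    while adjacent_value_gap(gamma, distance + 1) > 2 * epsilon:
        distance += 1
    return distance


if __name__ == "__main__":
    for eps in (0.01, 0.001):
        print(eps, max_rankable_distance(0.9, eps))
```

In this toy setting, reducing the approximation error by a factor of ten extends the reliable horizon by only a fixed number of steps, because the horizon grows with the logarithm of `1 / epsilon`. This is one way to read the scaling difficulty that the abstract refers to.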

Keywords

Computer science, Mathematics, Mathematical optimization
