Macro-Actions in Reinforcement Learning: An Empirical Analysis

Amy McGovern, Richard S. Sutton

Publication year: 1998
Citations: 43
Access: Open access

Abstract

Several researchers have proposed reinforcement learning methods that obtain advantages in learning by using temporally extended actions, or macro-actions, but none has carefully analyzed what these advantages are. In this paper, we separate and analyze two advantages of using macro-actions in reinforcement learning: the effect on exploratory behavior, independent of learning, and the effect on the speed with which the learning process propagates accurate value information. We empirically measure the separate contributions of these two effects in gridworld and simulated robotic environments. In these environments, both effects were significant, but the effect of value propagation was larger. We also compare the accelerations of value propagation due to macro-actions and eligibility traces in the gridworld environment. Although eligibility traces increased the rate of convergence to the optimal value function compared to learning with macro-actions but without eligibility traces, eligibility traces did not permit the optimal policy to be learned as quickly as it was using macro-actions.
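
The value-propagation effect described in the abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the 5x5 grid, the goal cell, the reward of 1 at the goal, a macro-action defined as "repeat one primitive move until a wall or the goal is reached", and an SMDP-style Q-learning backup. The names `step`, `run_macro`, and `macro_q_update` are hypothetical.

```python
SIZE = 5                      # assumed 5x5 grid (illustrative only)
GOAL = (4, 4)                 # assumed goal cell (illustrative only)
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, move):
    """Apply one primitive move; bumping into a wall leaves the state unchanged."""
    r, c = state
    dr, dc = MOVES[move]
    nr, nc = r + dr, c + dc
    if 0 <= nr < SIZE and 0 <= nc < SIZE:
        state = (nr, nc)
    reward = 1.0 if state == GOAL else 0.0
    return state, reward


def run_macro(state, move, gamma):
    """Repeat `move` until a wall or the goal is reached (assumed macro-action).

    Returns the final state, the discounted reward accumulated along the way,
    and the number of primitive steps taken.
    """
    total, k = 0.0, 0
    while True:
        nxt, reward = step(state, move)
        if nxt == state:          # blocked by a wall: the macro terminates
            break
        total += (gamma ** k) * reward
        k += 1
        state = nxt
        if state == GOAL:
            break
    return state, total, k


def macro_q_update(q, state, move, alpha=0.5, gamma=0.9):
    """One SMDP-style Q-learning backup for the macro-action (state, move)."""
    nxt, total, k = run_macro(state, move, gamma)
    if k == 0:
        return nxt                # the macro could not move; nothing to back up
    best_next = 0.0 if nxt == GOAL else max(q.get((nxt, m), 0.0) for m in MOVES)
    old = q.get((state, move), 0.0)
    q[(state, move)] = old + alpha * (total + (gamma ** k) * best_next - old)
    return nxt


q = {}
macro_q_update(q, (4, 0), "right")
```

In this sketch, a single backup from cell (4, 0) assigns a nonzero value to a state four cells away from the goal. One-step Q-learning without eligibility traces would move that value back only one cell per visit, which is one plausible reading of why macro-actions can speed up value propagation.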

Keywords

Macro, Reinforcement learning, Computer science, Process (computing), Artificial intelligence, Machine learning, Value (mathematics), Function (biology), Bellman equation, Reinforcement
