LEARNING

Beyond Target Networks: Improving Deep Q-learning with Functional Regularization.

Alexandre Piché, Joseph Marino, Gian Maria Marconi, Christopher Pal, Mohammad Emtiyaz Khan

Year published: 2021
Citations: 7

Abstract

Target networks are at the core of recent success in Reinforcement Learning. They stabilize training by using old parameters to estimate the $Q$-values, but this also limits the propagation of newly encountered rewards, which can ultimately slow training. In this work, we propose an alternative training method based on functional regularization which does not have this deficiency. Unlike target networks, our method uses up-to-date parameters to estimate the target $Q$-values, thereby speeding up training while maintaining stability. Surprisingly, in some cases, we can show that target networks are a special, restricted type of functional regularizer. Using this approach, we show empirical improvements in sample efficiency and performance across a range of Atari and simulated robotics environments.
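The contrast drawn in the abstract can be illustrated with a small tabular sketch. The abstract does not state the exact loss, so the form below is an assumption: the bootstrap target is computed from the up-to-date $Q$-values, and a quadratic penalty weighted by a hypothetical coefficient `kappa` keeps the current $Q$-value close to a lagged copy. The function names, the table layout (`q[state][action]`), and the numbers are all illustrative.

```python
def target_network_loss(q, q_lagged, transition, gamma=0.99):
    """Squared TD error with the bootstrap target taken from the lagged table."""
    s, a, r, s_next = transition
    # Old parameters estimate the target: newly learned values in q are ignored.
    target = r + gamma * max(q_lagged[s_next])
    return (target - q[s][a]) ** 2


def functional_regularization_loss(q, q_lagged, transition, gamma=0.99, kappa=0.5):
    """Squared TD error with an up-to-date target plus a function-space penalty.

    Assumed form, not taken from the abstract. In an autodiff framework the
    target would be treated as a constant (no gradient flows through it).
    """
    s, a, r, s_next = transition
    # Up-to-date parameters estimate the target.
    target = r + gamma * max(q[s_next])
    td_term = (target - q[s][a]) ** 2
    # Stability comes from penalizing drift away from the lagged Q-value.
    reg_term = kappa * (q[s][a] - q_lagged[s][a]) ** 2
    return td_term + reg_term


# Two states, two actions. The current table has already learned values for
# state 1 that the lagged copy has not seen yet.
q = [[0.0, 0.0], [1.0, 2.0]]
q_lagged = [[0.0, 0.0], [0.0, 0.0]]
transition = (0, 1, 1.0, 1)  # (state, action, reward, next_state)

tn_loss = target_network_loss(q, q_lagged, transition, gamma=0.5)
fr_loss = functional_regularization_loss(q, q_lagged, transition, gamma=0.5, kappa=0.5)
print(tn_loss, fr_loss)
```

In this toy case the lagged target ignores the value already learned for the next state, while the regularized variant bootstraps from it directly, which is the propagation effect the abstract describes.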

Keywords

Regularization (linguistics), Artificial intelligence, Computer science, Reinforcement learning, Stability (learning theory), Machine learning, Robotics, Sample complexity, Range (aeronautics), Sample (material)
