QUOTA: The Quantile Option Architecture for Reinforcement Learning
Shangtong Zhang, Hengshuai Yao
- Year: 2019
- Citations: 20
- Access: Open access
Abstract
In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration based on recent advances in distributional reinforcement learning (RL). In QUOTA, decision making is based on quantiles of a value distribution, not only the mean. QUOTA provides a new dimension for exploration by making use of both the optimism and the pessimism of a value distribution. We demonstrate the performance advantage of QUOTA in both challenging video games and physical robot simulators.
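To make the contrast between mean-based and quantile-based decision making concrete, here is a minimal Python sketch. It is not the authors' implementation and does not model the option architecture itself; the function names `greedy_by_mean` and `greedy_by_quantile` and the quantile values are hypothetical, chosen only to show how a low (pessimistic) or high (optimistic) quantile can prefer a different action than the mean does.

```python
from statistics import mean


def greedy_by_mean(quantiles):
    """Return the index of the action whose quantile estimates have the highest mean."""
    return max(range(len(quantiles)), key=lambda a: mean(quantiles[a]))


def greedy_by_quantile(quantiles, k):
    """Return the index of the action with the highest k-th quantile estimate.

    k = 0 selects on the lowest (most pessimistic) quantile,
    k = len(quantiles[a]) - 1 on the highest (most optimistic) one.
    """
    return max(range(len(quantiles)), key=lambda a: sorted(quantiles[a])[k])


# Hypothetical quantile estimates: 3 actions, 5 quantiles each (illustrative values only).
quantiles = [
    [1.0, 1.0, 1.0, 1.0, 1.0],    # action 0: no spread
    [0.0, 1.0, 1.5, 2.0, 2.0],    # action 1: highest mean
    [-3.0, -1.0, 0.5, 2.0, 6.0],  # action 2: widest spread
]

mean_choice = greedy_by_mean(quantiles)
pessimistic_choice = greedy_by_quantile(quantiles, 0)
optimistic_choice = greedy_by_quantile(quantiles, 4)
```

With these assumed values, the three criteria each select a different action, which is the extra degree of freedom for exploration that the abstract describes.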
Keywords
Quantile, Reinforcement learning, Architecture, Computer science, Optimism, Value (mathematics), Artificial intelligence, Econometrics, Machine learning, Economics