LEARNING
QUOTA: The Quantile Option Architecture for Reinforcement Learning
Shangtong Zhang, Hengshuai Yao
- Publication year
- 2019
- Citations
- 20
- Access
- Open access
Abstract
In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration based on recent advances in distributional reinforcement learning (RL). In QUOTA, decision making is based on quantiles of a value distribution, not only the mean. QUOTA provides a new dimension for exploration via making use of both optimism and pessimism of a value distribution. We demonstrate the performance advantage of QUOTA in both challenging video games and physical robot simulators.
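The idea of acting on quantiles rather than only the mean can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `greedy_action`, the toy quantile table, and the choice of the lowest and highest quantile as the "pessimistic" and "optimistic" views are all assumptions made for illustration. It only shows that the greedy action can differ depending on which part of the value distribution is consulted.

```python
from statistics import mean


def greedy_action(quantiles, lo, hi):
    """Return the index of the action whose quantile estimates,
    averaged over the slice [lo, hi), are largest."""
    scores = [mean(q[lo:hi]) for q in quantiles]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical quantile estimates of the return for one state:
# one row per action, each row sorted from lowest to highest quantile.
quantiles = [
    [0.0, 1.0, 2.0, 3.0],   # action 0: moderate spread, highest mean
    [-4.0, 0.0, 2.0, 6.0],  # action 1: wide spread, highest upper tail
    [1.0, 1.2, 1.4, 1.6],   # action 2: narrow spread, highest lower tail
]

mean_action = greedy_action(quantiles, 0, 4)         # all quantiles (the mean)
pessimistic_action = greedy_action(quantiles, 0, 1)  # lowest quantile only
optimistic_action = greedy_action(quantiles, 3, 4)   # highest quantile only

print(mean_action, pessimistic_action, optimistic_action)
```

In this toy table the three criteria pick three different actions, which is the extra "dimension for exploration" the abstract refers to; how QUOTA actually chooses among such quantile-based policies is described in the paper, not here.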
Keywords
Quantile, Reinforcement learning, Architecture, Computer science, Optimism, Value (mathematics), Artificial intelligence, Econometrics, Machine learning, Economics