On Meta-Reinforcement Learning in task distributions with varying dynamics
Federico Retyk
- Year: 2021
- Citations: 2
- Access: Open access
Abstract
Meta-reinforcement learning has the potential to enable artificial agents to master new skills with improved sample efficiency by leveraging previous learning experience in tasks that are diverse but share common structure. Our focus is to study the application of such algorithms to task distributions where the function that controls the dynamics of the environment is the main factor of variation. We start by providing an introductory background for related fields, including deep reinforcement learning, variational inference, and meta-learning. Then, we conduct a non-systematic review of the state-of-the-art algorithms for meta-reinforcement learning and perform an empirical investigation of PEARL, a method that combines soft actor-critic with latent task variables. Based on our review, we propose and implement two algorithmic modifications for PEARL: one that aims to improve the meta-training sample complexity by automatically adjusting a critical hyperparameter, and a second one focused on improving the meta-testing asymptotic performance by fine-tuning the policy during adaptation. Using a new multi-task environment suite for simulated robotics continuous control tasks, we compare the original version of PEARL and our proposed modifications, obtaining favourable results. Finally, we discuss our findings and suggest future research directions.