LEARNING

On Meta-Reinforcement Learning in task distributions with varying dynamics

Federico Retyk

Publication year
2021
Citations
2
Access
Open access

Abstract

Meta-reinforcement learning has the potential to enable artificial agents to master new skills with improved sample-efficiency by leveraging previous learning experience in tasks that are diverse but share common structure. Our focus is to study the application of such algorithms to task distributions where the function that controls the dynamics of the environment is the main factor of variation. We start by providing an introductory background for related fields, including deep reinforcement learning, variational inference, and meta-learning. Then, we conduct a non-systematic review of the state-of-the-art algorithms for meta-reinforcement learning and perform an empirical investigation of PEARL, a method that combines soft actor-critic with latent task variables. Based on our review, we propose and implement two algorithmic modifications for PEARL: one that aims to improve the meta-training sample complexity by automatically adjusting a critical hyperparameter, and a second one focused on improving the meta-testing asymptotic performance by fine-tuning the policy during adaptation. Using a new multi-task environment suite for simulated robotics continuous control tasks, we compare the original version of PEARL and our proposed modifications, obtaining favourable results. Finally, we reflect on our findings and suggest future research directions.

Keywords

Reinforcement learning; Dynamics (music); Task (project management); Reinforcement; Computer science; Artificial intelligence; Cognitive psychology; Psychology; Engineering; Social psychology
