Prioritized Sampling with Intrinsic Motivation in Multi-Task Reinforcement Learning

Carlo D’Eramo, Georgia Chalvatzaki

Year
2022
Citations
3

Abstract

Deep Reinforcement Learning (RL) promises to lead the next advances towards the development of future intelligent agents. However, the unprecedented representational power of deep function approximators, e.g., deep neural networks, comes at the cost of demanding a huge amount of experience, making deep RL impractical for applications requiring interactions with the real world. We study the problem of using samples more efficiently in deep RL, exploiting the desirable properties of knowledge generalization that result from learning multiple tasks together. The outcome of our work is the coupling of multi-task RL algorithms with a task-sampling policy based on the well-known intrinsic motivation paradigm. In particular, we use the TD-error of Bellman updates as an effective measure of learning progress, and prioritize sampling from the tasks that contribute the most to the learning of the agent. This sampling strategy speeds up the learning of tasks for which the agent is showing progress and postpones the learning of the remaining ones, resulting in an optimized collection of samples. Our method is supported by experimental evaluations on well-known RL control tasks, for which our approach shows superior sample efficiency and performance compared to representative baselines. Finally, we evaluate our approach on simulated control tasks based on Quanser robotics systems, confirming the advantages over the baselines in more realistic applications as well.
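
The task-sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `TDErrorTaskSampler`, the exponential moving average of absolute TD-errors, and the parameters `alpha`, `decay`, and `eps` are all assumptions introduced here for illustration. The sketch only shows one plausible way to turn per-task TD-errors into task-sampling probabilities.

```python
import numpy as np


class TDErrorTaskSampler:
    """Hypothetical task sampler: tasks with larger recent |TD-error| are drawn more often."""

    def __init__(self, n_tasks, alpha=1.0, decay=0.9, eps=1e-3, seed=0):
        self.alpha = alpha    # assumed exponent: 0 gives uniform sampling, larger is greedier
        self.decay = decay    # assumed smoothing factor for the moving average
        self.eps = eps        # keeps every task's probability strictly positive
        self.priorities = np.ones(n_tasks)  # start with equal priority for all tasks
        self.rng = np.random.default_rng(seed)

    def probabilities(self):
        # Normalize the exponentiated priorities into a distribution over tasks.
        scaled = (self.priorities + self.eps) ** self.alpha
        return scaled / scaled.sum()

    def sample(self):
        # Draw the index of the next task to collect experience from.
        return int(self.rng.choice(len(self.priorities), p=self.probabilities()))

    def update(self, task, td_errors):
        # Blend the mean absolute TD-error of the latest batch into the task's priority.
        score = float(np.mean(np.abs(td_errors)))
        self.priorities[task] = self.decay * self.priorities[task] + (1.0 - self.decay) * score
```

In a training loop, one would call `sample()` to pick a task, collect transitions and run Bellman updates on it, then pass the resulting TD-errors to `update()`. The proportional rule used here is only one choice; the paper may define the sampling policy differently.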

Keywords

Reinforcement learning, Computer science, Task (project management), Intrinsic motivation, Sampling (signal processing), Reinforcement, Artificial intelligence, Machine learning, Psychology, Social psychology
