Value Function Transfer for Deep Multi-Agent Reinforcement Learning Based on N-Step Returns

Yong Liu, Yujing Hu, Yang Gao, Yingfeng Chen, Changjie Fan

Publication year
2019
Citations
34
Access
Open access

Abstract

Many real-world problems, such as robot control and soccer games, are naturally modeled as sparse-interaction multi-agent systems. Reusing single-agent knowledge in multi-agent systems with sparse interactions can greatly accelerate multi-agent learning. Previous work relies on the bisimulation metric to define Markov decision process (MDP) similarity for controlling knowledge transfer. However, the bisimulation metric is costly to compute and is unsuitable for problems with high-dimensional state spaces. In this work, we propose more scalable transfer learning methods based on a novel concept of MDP similarity. We start by defining MDP similarity based on the N-step return (NSR) values of an MDP. We then propose two knowledge transfer methods based on deep neural networks: direct value function transfer and NSR-based value function transfer. We conduct experiments in an image-based grid world, the multi-agent particle environment (MPE), and the Ms. Pac-Man game. The results indicate that the proposed methods can significantly accelerate multi-agent reinforcement learning while also achieving better asymptotic performance.
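
The abstract does not spell out how an N-step return is computed, so the following is only a minimal sketch of the textbook n-step return from reinforcement learning: the discounted sum of the next n rewards plus a discounted bootstrap value. The function name `n_step_return` and its parameters are hypothetical, and the paper's exact NSR definition and its MDP similarity measure may differ from this.

```python
from typing import Sequence


def n_step_return(
    rewards: Sequence[float],
    bootstrap_value: float,
    gamma: float,
    n: int,
) -> float:
    """Textbook n-step return (an assumption, not the paper's definition).

    rewards:         rewards observed from the current step onward
    bootstrap_value: value estimate for the state reached after n steps
    gamma:           discount factor in [0, 1]
    n:               number of real reward steps to use before bootstrapping
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    # Use fewer than n steps if the episode ended early.
    steps = min(n, len(rewards))

    # Discounted sum of the observed rewards.
    total = 0.0
    for k in range(steps):
        total += (gamma ** k) * rewards[k]

    # Bootstrap only when all n rewards were available; an episode that
    # terminated early has no successor state to bootstrap from.
    if steps == n:
        total += (gamma ** n) * bootstrap_value

    return total
```

For instance, with rewards `[1.0, 1.0, 1.0]`, a bootstrap value of `10.0`, `gamma = 0.5`, and `n = 2`, the function sums the first two discounted rewards and adds the bootstrap value discounted by `gamma ** 2`.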

Keywords

Reinforcement learning, Computer science, Markov decision process, Transfer of learning, Bellman equation, Artificial intelligence, Q-learning, Similarity (geometry), Metric (unit), Artificial neural network
