
LEARNING

Value Function Transfer for Deep Multi-Agent Reinforcement Learning Based on N-Step Returns

Yong Liu, Yujing Hu, Yang Gao, Yingfeng Chen, Changjie Fan

Year of publication: 2019
Citations: 34
Access: Open access

Abstract

Many real-world problems, such as robot control and soccer games, are naturally modeled as sparse-interaction multi-agent systems. Reusing single-agent knowledge in multi-agent systems with sparse interactions can greatly accelerate multi-agent learning. Previous work relies on the bisimulation metric to define Markov decision process (MDP) similarity for controlling knowledge transfer. However, the bisimulation metric is costly to compute and is not suitable for problems with high-dimensional state spaces. In this work, we propose more scalable transfer learning methods based on a novel concept of MDP similarity. We start by defining MDP similarity based on the N-step return (NSR) values of an MDP. We then propose two knowledge transfer methods based on deep neural networks: direct value function transfer and NSR-based value function transfer. We conduct experiments in an image-based grid world, the multi-agent particle environment (MPE), and the Ms. Pac-Man game. The results indicate that the proposed methods can significantly accelerate multi-agent reinforcement learning while also achieving better asymptotic performance.
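
To make the N-step return concrete, here is a minimal Python sketch. It assumes the NSR is the discounted sum of the first N rewards with no bootstrap term, which may differ from the exact definition in the paper. The reward sequences, the discount factor, and the absolute-difference comparison are hypothetical illustrations, not the authors' similarity measure.

```python
from typing import Sequence


def n_step_return(rewards: Sequence[float], gamma: float, n: int) -> float:
    """Discounted sum of the first n rewards (assumed form, no bootstrap term)."""
    total = 0.0
    for k, reward in enumerate(rewards[:n]):
        total += (gamma ** k) * reward
    return total


# Hypothetical reward sequences observed from the same start state
# in a single-agent MDP and in its multi-agent counterpart.
single_agent_rewards = [0.0, 0.0, 1.0, 0.0]
multi_agent_rewards = [0.0, -1.0, 1.0, 0.0]

nsr_single = n_step_return(single_agent_rewards, gamma=0.5, n=3)
nsr_multi = n_step_return(multi_agent_rewards, gamma=0.5, n=3)

# Illustrative comparison only: a small gap suggests the two MDPs behave
# alike near this state, so single-agent knowledge may be safe to reuse.
nsr_gap = abs(nsr_single - nsr_multi)
print(nsr_single, nsr_multi, nsr_gap)
```

In this toy case the second sequence includes a penalty (for example, a collision with another agent), so the two N-step returns diverge. A gap of this kind is the sort of signal a similarity-based transfer rule could use to decide where single-agent values should not be trusted.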

Keywords

Reinforcement learning, Computer science, Markov decision process, Transfer of learning, Bellman equation, Artificial intelligence, Q-learning, Similarity (geometry), Metric (unit), Artificial neural network
