Exploiting Bias for Cooperative Planning in Multi-Agent Tree Search

Aaron Ma, Michael Ouimet, Jorge Cortés

发表年份: 2020
引用次数: 5

摘要

Graph search over states and actions is a valuable tool for robotic planning and navigation. However, the required computation is sensitive to the size of the state and action spaces, a fact which is further exacerbated in multi-agent planning by the number of agents and the presence of sparse reward signals dependent on the cooperation of agents. To tackle these problems, we introduce an algorithm that is pre-trained in a centralized fashion but implemented on robots in a distributed way at runtime. The centralized portion uses imitation learning to iteratively construct policies that help guide an individual agent`s own runtime search as well as predict other agents' future actions by exploiting previously discovered joint actions. Our algorithm includes a novel method of tree search based on a mixture of the individual and joint action space, which can be interpreted as a cascading effect where agents are biased by exploration of new actions, exploitation of previously profitable ones, and recommendation provided by deep neural nets. Simulations show the efficacy of the proposed method in cooperative scenarios with sparse rewards.

关键词

Computer scienceTree (set theory)MathematicsCombinatorics

Exploiting Bias for Cooperative Planning in Multi-Agent Tree Search

摘要

关键词

相关论文

Statistical Learning Theory

Artificial intelligence: a modern approach

Fractional Differential Equations

Applied Nonlinear Control