Exploiting Bias for Cooperative Planning in Multi-Agent Tree Search
Aaron Ma, Michael Ouimet, Jorge Cortés
Year: 2020
Citations: 5
Abstract
Graph search over states and actions is a valuable tool for robotic planning and navigation. However, the required computation is sensitive to the size of the state and action spaces, a fact which is further exacerbated in multi-agent planning by the number of agents and the presence of sparse reward signals dependent on the cooperation of agents. To tackle these problems, we introduce an algorithm that is pre-trained in a centralized fashion but implemented on robots in a distributed way at runtime. The centralized portion uses imitation learning to iteratively construct policies that help guide an individual agent's own runtime search as well as predict other agents' future actions by exploiting previously discovered joint actions. Our algorithm includes a novel method of tree search based on a mixture of the individual and joint action space, which can be interpreted as a cascading effect where agents are biased by exploration of new actions, exploitation of previously profitable ones, and recommendation provided by deep neural nets. Simulations show the efficacy of the proposed method in cooperative scenarios with sparse rewards.
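To make the three biases named in the abstract concrete (exploration, exploitation, and neural-net recommendation), the following is a minimal sketch of a generic PUCT-style selection rule of the kind used in policy-guided tree search. It is not the algorithm from the paper: the function `select_action`, the constant `c_explore`, and the example statistics are all hypothetical, and the learned policy is stood in for by a fixed dictionary of prior probabilities.

```python
import math

def select_action(stats, prior, c_explore=1.0):
    """Pick the action with the best exploitation + guided-exploration score.

    stats: dict mapping action -> (visit_count, total_reward)
    prior: dict mapping action -> probability suggested by a learned policy
           (hypothetical stand-in for a neural net recommendation)
    """
    total_visits = sum(n for n, _ in stats.values())

    def score(action):
        n, w = stats[action]
        # Exploitation: mean reward observed so far for this action.
        exploit = w / n if n > 0 else 0.0
        # Exploration: larger for rarely visited actions, scaled by the prior.
        explore = c_explore * prior[action] * math.sqrt(total_visits + 1) / (1 + n)
        return exploit + explore

    return max(stats, key=score)

# Illustrative numbers only (not from the paper).
stats = {"left": (10, 2.0), "right": (0, 0.0), "stay": (5, 3.0)}
prior = {"left": 0.2, "right": 0.5, "stay": 0.3}

chosen = select_action(stats, prior)                  # exploration and prior dominate
greedy = select_action(stats, prior, c_explore=0.0)   # pure exploitation
```

With these numbers the unvisited, highly recommended action wins under the default setting, while setting `c_explore` to zero falls back to the action with the best observed mean reward. A multi-agent variant would also need a model of the other agents' actions, which this sketch omits.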