Policy gradient assisted MAP-Elites

Olle Nilsson, Antoine Cully

发表年份: 2021
引用次数: 62

摘要

Quality-Diversity optimization algorithms such as MAP-Elites, aim to generate collections of both diverse and high-performing solutions to an optimization problem. MAP-Elites has shown promising results in a variety of applications. In particular in evolutionary robotics tasks targeting the generation of behavioral repertoires that highlight the versatility of robots. However, for most robotics applications MAP-Elites is limited to using simple open-loop or low-dimensional controllers. Here we present Policy Gradient Assisted MAP-Elites (PGA-MAP-Elites), a novel algorithm that enables MAP-Elites to efficiently evolve large neural network controllers by introducing a gradient-based variation operator inspired by Deep Reinforcement Learning. This operator leverages gradient estimates obtained from a critic neural network to rapidly find higher-performing solutions and is paired with a traditional genetic variation to maintain a divergent search behavior. The synergy of these operators makes PGA-MAP-Elites an efficient yet powerful algorithm for finding diverse and high-performing behaviors. We evaluate our method on four different tasks for building behavioral repertoires that use deep neural network controllers. The results show that PGA-MAP-Elites significantly improves the quality of the generated repertoires compared to existing methods.

关键词

RoboticsArtificial intelligenceComputer scienceArtificial neural networkEvolutionary roboticsReinforcement learningOperator (biology)RobotVariation (astronomy)Quality (philosophy)

Policy gradient assisted MAP-Elites

摘要

关键词

相关论文

Statistical Learning Theory

Artificial intelligence: a modern approach

Applied Nonlinear Control

A new optimizer using particle swarm theory