Hierarchical Potential-based Reward Shaping from Task Specifications

Luigi Berducci, Edgar A. Aguilar, Dejan Ničković, Radu Grosu

Year: 2021
Citations: 4
Access: Open access

Abstract

The automatic synthesis of policies for robotic-control tasks through reinforcement learning relies on a reward signal that simultaneously captures many possibly conflicting requirements. In this paper, we introduce a novel, hierarchical, potential-based reward-shaping approach (HPRS) for defining effective, multivariate rewards for a large family of such control tasks. We formalize a task as a partially-ordered set of safety, target, and comfort requirements, and define an automated methodology to enforce a natural order among requirements and shape the associated reward. Building upon potential-based reward shaping, we show that HPRS preserves policy optimality. Our experimental evaluation demonstrates HPRS's superior ability in capturing the intended behavior, resulting in task-satisfying policies with improved comfort, and converging to optimal behavior faster than other state-of-the-art approaches. We demonstrate the practical usability of HPRS on several robotics applications and the smooth sim2real transition on two autonomous-driving scenarios for F1TENTH race cars.
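To make the potential-based shaping mentioned in the abstract concrete, here is a minimal Python sketch of the generic rule r' = r + γΦ(s') − Φ(s). The function names, the three-component state layout, and in particular `hierarchical_potential` are assumptions made for illustration only; they are not the potential defined in the paper. The sketch only shows how a potential that gates target progress by safety, and comfort by target progress, could plug into the standard shaping rule.

```python
from typing import Callable, Tuple

# Hypothetical state layout (an assumption, not from the paper):
# (safety indicator in {0, 1}, target progress in [0, 1], comfort score in [0, 1])
State = Tuple[float, float, float]


def hierarchical_potential(state: State) -> float:
    """Illustrative potential: comfort only counts once there is target
    progress, and nothing counts unless the state is safe."""
    safe, target_progress, comfort = state
    return safe * (1.0 + target_progress * (1.0 + comfort))


def shaped_reward(
    base_reward: float,
    phi: Callable[[State], float],
    state: State,
    next_state: State,
    gamma: float = 0.99,
) -> float:
    """Standard potential-based shaping: r' = r + gamma * phi(s') - phi(s)."""
    return base_reward + gamma * phi(next_state) - phi(state)


if __name__ == "__main__":
    s = (1.0, 0.5, 0.0)       # safe, halfway to the target
    s_next = (1.0, 1.0, 0.0)  # safe, target reached
    print(shaped_reward(0.0, hierarchical_potential, s, s_next, gamma=0.5))
```

Because the shaping term is a discounted difference of potentials, its contributions telescope along a trajectory, which is the property underlying the optimality-preservation result the abstract refers to.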

Keywords

Reinforcement learning, Task (project management), Set (abstract data type), Computer science, Usability, Artificial intelligence, Control (management), Human–computer interaction, Robotics, Robot
