Hierarchical Potential-based Reward Shaping from Task Specifications
Luigi Berducci, Edgar A. Aguilar, Dejan Ničković, Radu Grosu
- Publication year
- 2021
- Access
- Open access
Abstract
The automatic synthesis of policies for robotic-control tasks through reinforcement learning relies on a reward signal that simultaneously captures many possibly conflicting requirements. In this paper, we introduce a novel, hierarchical, potential-based reward-shaping approach (HPRS) for defining effective, multivariate rewards for a large family of such control tasks. We formalize a task as a partially-ordered set of safety, target, and comfort requirements, and define an automated methodology to enforce a natural order among requirements and shape the associated reward. Building upon potential-based reward shaping, we show that HPRS preserves policy optimality. Our experimental evaluation demonstrates HPRS's superior ability in capturing the intended behavior, resulting in task-satisfying policies with improved comfort, and converging to optimal behavior faster than other state-of-the-art approaches. We demonstrate the practical usability of HPRS on several robotics applications and the smooth sim2real transition on two autonomous-driving scenarios for F1TENTH race cars.
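The optimality claim rests on a general property of potential-based reward shaping, which can be illustrated with a minimal Python sketch. This is the generic shaping term gamma * Phi(s') - Phi(s), not the hierarchical potential defined in the paper. The one-dimensional state, the goal at 10.0, the `potential` function, and the sample trajectory are all hypothetical choices made for illustration.

```python
from typing import Sequence


def potential(state: float) -> float:
    """Hypothetical potential: negative distance to an assumed goal at 10.0."""
    return -abs(10.0 - state)


def shaped_reward(reward: float, state: float, next_state: float, gamma: float) -> float:
    """Generic potential-based shaping: r + gamma * Phi(s') - Phi(s)."""
    return reward + gamma * potential(next_state) - potential(state)


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma^t * r_t over a finite trajectory."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


# Illustrative trajectory (assumed values): states visited and base rewards received.
gamma = 0.9
states = [0.0, 3.0, 7.0, 10.0]
base_rewards = [0.0, 0.0, 1.0]

shaped_rewards = [
    shaped_reward(r, s, s_next, gamma)
    for r, s, s_next in zip(base_rewards, states[:-1], states[1:])
]

base_return = discounted_return(base_rewards, gamma)
shaped_return = discounted_return(shaped_rewards, gamma)

# The shaping terms telescope: the difference depends only on the first and
# last states, not on the path taken between them.
expected_gap = gamma ** len(base_rewards) * potential(states[-1]) - potential(states[0])
```

Because the gap between shaped and unshaped returns depends only on the endpoints of the trajectory, shaping of this form changes how quickly a learner receives feedback without changing which policies rank highest. That is the standard argument behind optimality preservation for this family of methods.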