Don't do it: Safer Reinforcement Learning With Rule-based Guidance

Ekaterina Nikonova, Cheng Xue, Jochen Renz

Year of publication: 2022
Access: Open access

Abstract

During training, reinforcement learning systems interact with the world without considering the safety of their actions. When deployed into the real world, such systems can be dangerous and cause harm to their surroundings. Often, dangerous situations can be mitigated by defining a set of rules that the system should not violate under any conditions. For example, in robot navigation, one safety rule would be to avoid colliding with surrounding objects and people. In this work, we define safety rules in terms of the relationships between the agent and objects and use them to prevent reinforcement learning systems from performing potentially harmful actions. We propose a new safe epsilon-greedy algorithm that uses safety rules to override agents' actions if they are considered to be unsafe. In our experiments, we show that a safe epsilon-greedy policy significantly increases the safety of the agent during training, improves the learning efficiency resulting in much faster convergence, and achieves better performance than the base model.
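The action-override idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how rules are encoded or which action replaces an unsafe one. Here `is_unsafe` is a hypothetical predicate standing in for the safety rules, and the fallback to the highest-valued safe action is an assumption made for illustration.

```python
import random
from typing import Callable, Sequence


def safe_epsilon_greedy(
    q_values: Sequence[float],
    is_unsafe: Callable[[int], bool],
    epsilon: float,
    rng: random.Random,
) -> int:
    """Pick an action epsilon-greedily, overriding it if a rule flags it.

    `is_unsafe` is a hypothetical stand-in for the paper's safety rules.
    """
    actions = list(range(len(q_values)))

    # Ordinary epsilon-greedy proposal: explore with probability epsilon,
    # otherwise exploit the highest-valued action.
    if rng.random() < epsilon:
        proposed = rng.choice(actions)
    else:
        proposed = max(actions, key=lambda a: q_values[a])

    if not is_unsafe(proposed):
        return proposed

    # Override step (assumed strategy): replace the unsafe proposal with
    # the highest-valued action that no rule forbids.
    safe_actions = [a for a in actions if not is_unsafe(a)]
    if not safe_actions:
        # No rule-compliant action exists; keep the original proposal.
        return proposed
    return max(safe_actions, key=lambda a: q_values[a])
```

For example, with `q_values = [1.0, 5.0, 2.0]` and a rule that forbids action 1, the greedy choice (action 1) is overridden and action 2 is returned instead.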

Keywords

cs.AI
