LexiSafe: Offline Safe Reinforcement Learning with Lexicographic Safety-Reward Hierarchy
Hsin-Jung Yang, Zhanhong Jiang, Prajwal Koirala, Qisai Liu, Cody Fleming, Soumik Sarkar
- 发表年份
- 2026
- 访问权限
- 开放获取
摘要
Offline safe reinforcement learning (RL) is increasingly important for cyber-physical systems (CPS), where safety violations during training are unacceptable and only pre-collected data are available. Existing offline safe RL methods typically balance reward-safety tradeoffs through constraint relaxation or joint optimization, but they often lack structural mechanisms to prevent safety drift. We propose LexiSafe, a lexicographic offline RL framework designed to preserve safety-aligned behavior. We first develop LexiSafe-SC, a single-cost formulation for standard offline safe RL, and derive safety-violation and performance-suboptimality bounds that together yield sample-complexity guarantees. We then extend the framework to hierarchical safety requirements with LexiSafe-MC, which supports multiple safety costs and admits its own sample-complexity analysis. Empirically, LexiSafe demonstrates reduced safety violations and improved task performance compared to constrained offline baselines. By unifying lexicographic prioritization with structural bias, LexiSafe offers a practical and theoretically grounded approach for safety-critical CPS decision-making.
关键词
相关论文
面向学习与规划的并行可微可达性:具有认证神经动力学与控制器的系统
Keyi Shen, Glen Chou
2026
人工智能增强的智能焊接岛:基础模型革新制造业
Xiwei Wu, Wei Wu, Qiqi Chen 等 9 位作者
Robotics and Computer-Integrated Manufacturing · 2026
基于深度强化学习和动态图神经网络的多任务机器人调度代理
Hedi Boukamcha, Anas Neumann, Monia Rekik 等 6 位作者
Robotics and Computer-Integrated Manufacturing · 2026
基于微调与AAS增强检索的LLM驱动自动化DFA评估
Jiaxin Liu, Xiaofeng Zhou, Suyang Yu 等 8 位作者
Robotics and Computer-Integrated Manufacturing · 2026