LEARNING

Unifying Hamilton-Jacobi Reachability and Reinforcement Learning

Prashant Solanki, Isabelle El-Hajj, Jasper van Beers, Erik-Jan van Kampen, Coen de Visser

Publication year
2026
Access
Open access

Abstract

We unify Hamilton-Jacobi (HJ) reachability and Reinforcement Learning (RL) through a proposed running cost formulation. We prove that the resultant travel-cost value function is the unique bounded viscosity solution of a time-dependent Hamilton-Jacobi-Bellman (HJB) Partial Differential Equation (PDE) with zero terminal data, whose negative sublevel set equals the strict backward-reachable tube. Using a forward reparameterization and a contraction-inducing Bellman update, we show that fixed points of small-step RL value iteration converge to the viscosity solution of the forward discounted HJB. Experiments on a classical benchmark validate this connection by demonstrating convergence of learned value functions toward semi-Lagrangian HJB solutions and by quantifying approximation error across the state space. These results empirically support the theoretical analysis, showing that the proposed framework preserves reachability-based safety semantics while remaining compatible with deep RL implementations.
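To make the phrase "contraction-inducing Bellman update" concrete, the following is a minimal sketch of a generic small-step discounted value iteration on a 1D grid. It is not the paper's formulation: the toy dynamics x' = u, the running cost, the discount rate `lam`, the step `dt`, and the helper names `bellman_update` and `value_iteration` are all assumptions chosen for illustration. The backup uses linear interpolation in a semi-Lagrangian style, so each update is a sup-norm contraction with factor exp(-lam * dt).

```python
import numpy as np

def bellman_update(V, xs, cost, controls, dt, lam):
    """One small-step discounted Bellman backup on a 1D grid.

    Assumed toy dynamics: x' = u, so the next state is x + u * dt.
    """
    gamma = np.exp(-lam * dt)  # per-step discount factor, strictly below 1
    # candidates[i, j]: interpolated value reached from grid point j under control i
    # (np.interp clamps queries outside the grid to the boundary values)
    candidates = np.stack([np.interp(xs + u * dt, xs, V) for u in controls])
    return dt * cost + gamma * candidates.min(axis=0)

def value_iteration(xs, cost, controls, dt=0.05, lam=1.0, tol=1e-8, max_iter=10_000):
    """Iterate the backup until successive iterates differ by less than tol."""
    V = np.zeros_like(xs)
    for _ in range(max_iter):
        V_new = bellman_update(V, xs, cost, controls, dt, lam)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

xs = np.linspace(-2.0, 2.0, 201)
# Hypothetical running cost: negative inside an assumed target set |x| < 0.5, zero elsewhere
cost = np.minimum(np.abs(xs) - 0.5, 0.0)
controls = np.array([-1.0, 0.0, 1.0])
V = value_iteration(xs, cost, controls)
```

Because interpolation and the pointwise minimum are both non-expansive in the sup norm, the discount factor alone gives the contraction, which is why the iteration converges from any bounded initial guess in this sketch.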

Keywords

eess.SY
