MANIPULATION

CRISP: Curriculum Inducing Primitive Informed Subgoal Prediction for Hierarchical Reinforcement Learning

Utsav Singh, Vinay P. Namboodiri

Year of publication
2023
Access
Open access

Abstract

Hierarchical reinforcement learning (HRL) leverages temporal abstraction to efficiently tackle complex long-horizon tasks. However, HRL often collapses because the continual updates of the low-level primitive make earlier subgoals issued by the high-level policy obsolete, introducing non-stationarity that destabilizes training. We propose CRISP, a curriculum-driven framework that tackles this instability with three key ingredients: (1) primitive-informed parsing (PIP), which adaptively relabels a handful of expert demonstrations so that the resulting subgoals are always reachable by the current low-level primitive, (2) an inverse-reinforcement-learning regularizer that steers the high-level policy toward the expert-induced subgoal distribution and stabilizes learning, and (3) a unified training loop that leverages these components to boost sample efficiency. Across six sparse-reward robotic navigation and manipulation benchmarks, CRISP improves success rates by more than 40% over strong hierarchical and flat baselines and successfully transfers to real-world tasks, demonstrating the promise of curriculum-based HRL for practical scenarios.
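
The relabeling idea behind PIP can be illustrated with a minimal sketch. The abstract does not say how reachability is judged, so the `reachable` callback, the distance-based `within` stand-in, and the function name `parse_demo` are all assumptions made for illustration, not the authors' implementation. The sketch only shows how a stronger primitive leads to subgoals that lie farther along the same demonstration, which is the curriculum effect the abstract describes.

```python
import math
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]


def parse_demo(
    demo: Sequence[State],
    reachable: Callable[[State, State], bool],
) -> List[Tuple[State, State]]:
    """Relabel one demonstration into (state, subgoal) pairs.

    From each anchor state, scan forward while the next demonstration
    state is still judged reachable, use the last accepted state as the
    subgoal, then continue parsing from that subgoal.
    `reachable` is a hypothetical stand-in for a test based on the
    current low-level primitive.
    """
    pairs: List[Tuple[State, State]] = []
    last = len(demo) - 1
    i = 0
    while i < last:
        j = i + 1  # always advance at least one step so parsing terminates
        while j < last and reachable(demo[i], demo[j + 1]):
            j += 1
        pairs.append((demo[i], demo[j]))
        i = j
    return pairs


def within(radius: float) -> Callable[[State, State], bool]:
    """Toy reachability test (assumption): goal lies within `radius`."""
    return lambda s, g: math.dist(s, g) <= radius


# A straight-line demonstration with seven states.
demo = [(float(x), 0.0) for x in range(7)]
weak = parse_demo(demo, within(1.0))    # early, weak primitive
strong = parse_demo(demo, within(3.0))  # later, stronger primitive
print(len(weak), len(strong))
```

Re-running the parser whenever the primitive is updated would yield progressively harder subgoals from the same handful of demonstrations.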

Keywords

cs.LG
