Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL
Rui Yang, Yiming Lu, Wenzhe Li, Hao Sun, Meng Fang, Yali Du, Xiu Li, Lei Han, Chongjie Zhang
- Publication year: 2022
- Access: Open access
Abstract
Solving goal-conditioned tasks with sparse rewards using self-supervised learning is promising because of its simplicity and stability compared with current reinforcement learning (RL) algorithms. A recent work, Goal-Conditioned Supervised Learning (GCSL), provides a new learning framework that iteratively relabels and imitates self-generated experiences. In this paper, we revisit the theoretical property of GCSL (that it optimizes a lower bound of the goal-reaching objective) and extend GCSL into a novel offline goal-conditioned RL algorithm. The proposed method, Weighted GCSL (WGCSL), introduces a compound weight consisting of three parts: (1) a discounted weight for goal relabeling, (2) a goal-conditioned exponential advantage weight, and (3) a best-advantage weight. Theoretically, WGCSL is proven to optimize an equivalent lower bound of the goal-conditioned RL objective and to generate monotonically improved policies via an iterated scheme. The monotonic property holds for any behavior policy, so WGCSL can be applied in both online and offline settings. To evaluate algorithms in the offline goal-conditioned RL setting, we provide a benchmark covering a range of point and simulated robot domains. Experiments on this benchmark demonstrate that WGCSL consistently outperforms GCSL and existing state-of-the-art offline methods in the fully offline goal-conditioned setting.
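The three-part compound weight named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact functional forms, so everything below is an assumption made for illustration: the function names `compound_weight` and `weighted_imitation_loss`, the clipping of the exponential term, the threshold rule for the best-advantage term, and all default values are hypothetical and are not taken from the paper.

```python
import math


def compound_weight(steps_to_goal, advantage, adv_threshold,
                    gamma=0.98, clip_max=10.0, eps_min=0.05):
    """Hypothetical compound weight for one relabeled (state, action, goal) sample.

    All functional forms and defaults here are illustrative assumptions.
    """
    # (1) Discounted relabeling weight: relabeled goals that lie further
    #     in the future of the trajectory contribute less.
    discount_w = gamma ** steps_to_goal
    # (2) Exponential advantage weight, clipped from above (assumed) so that
    #     a single large advantage cannot dominate the batch.
    exp_adv_w = min(math.exp(advantage), clip_max)
    # (3) Best-advantage weight: samples whose advantage does not exceed the
    #     threshold are down-weighted to eps_min (assumed rule).
    best_adv_w = 1.0 if advantage > adv_threshold else eps_min
    return discount_w * exp_adv_w * best_adv_w


def weighted_imitation_loss(log_probs, weights):
    """Weighted negative log-likelihood, averaged over the batch."""
    total = sum(w * lp for w, lp in zip(weights, log_probs))
    return -total / len(log_probs)
```

Under these assumptions, the supervised imitation objective of GCSL stays unchanged in form; each sample's log-likelihood is simply scaled by its weight, so setting every weight to 1 gives back an unweighted imitation loss.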