USHER: Unbiased Sampling for Hindsight Experience Replay
Liam Schramm, Yunfu Deng, Edgar Granados, Abdeslam Boularias
- 发表年份
- 2022
- 访问权限
- 开放获取
摘要
Dealing with sparse rewards is a long-standing challenge in reinforcement learning (RL). Hindsight Experience Replay (HER) addresses this problem by reusing failed trajectories for one goal as successful trajectories for another. This allows for both a minimum density of reward and for generalization across multiple goals. However, this strategy is known to result in a biased value function, as the update rule underestimates the likelihood of bad outcomes in a stochastic environment. We propose an asymptotically unbiased importance-sampling-based algorithm to address this problem without sacrificing performance on deterministic environments. We show its effectiveness on a range of robotic systems, including challenging high dimensional stochastic environments.
关键词
相关论文
面向学习与规划的并行可微可达性:具有认证神经动力学与控制器的系统
Keyi Shen, Glen Chou
2026
人工智能增强的智能焊接岛:基础模型革新制造业
Xiwei Wu, Wei Wu, Qiqi Chen 等 9 位作者
Robotics and Computer-Integrated Manufacturing · 2026
基于深度强化学习和动态图神经网络的多任务机器人调度代理
Hedi Boukamcha, Anas Neumann, Monia Rekik 等 6 位作者
Robotics and Computer-Integrated Manufacturing · 2026
基于微调与AAS增强检索的LLM驱动自动化DFA评估
Jiaxin Liu, Xiaofeng Zhou, Suyang Yu 等 8 位作者
Robotics and Computer-Integrated Manufacturing · 2026