PNS: Population-Guided Novelty Search for Reinforcement Learning in Hard Exploration Environments
Qihao Liu, Yujia Wang, Xiaofeng Liu
- Publication year
- 2018
- Access
- Open access
Abstract
Reinforcement Learning (RL) has made remarkable achievements, but it still suffers from inadequate exploration strategies, sparse reward signals, and deceptive reward functions. To alleviate these problems, this paper proposes Population-guided Novelty Search (PNS), a parallel learning method. In PNS, the population is divided into multiple sub-populations, each of which has one chief agent and several exploring agents. The chief agent evaluates the policies learned by the exploring agents and shares the optimal policy with all sub-populations. The exploring agents learn their policies collaboratively under the guidance of the optimal policy and, simultaneously, upload their policies to the chief agent. To balance exploration and exploitation, Novelty Search (NS) is employed in every chief agent to encourage policies with high novelty while maximizing per-episode performance. We apply PNS to the twin delayed deep deterministic policy gradient (TD3) algorithm. The experimental section demonstrates the effectiveness of PNS in promoting exploration and improving performance in continuous control domains. Notably, PNS-TD3 achieves rewards that far exceed those of state-of-the-art (SOTA) methods in environments with sparse or delayed reward signals. We also demonstrate that PNS enables robotic agents to learn control policies directly from pixels for sparse-reward manipulation in both simulated and real-world settings.
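The abstract does not say how a chief agent weighs novelty against per-episode performance, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes the common Novelty Search convention of scoring a policy by its mean distance to the k nearest behaviors in an archive, and it assumes the chief agent ranks policies by return plus a weighted novelty term. The names `novelty`, `chief_select`, and `novelty_weight`, and the use of a fixed-length behavior vector, are all hypothetical.

```python
import math


def novelty(behavior, archive, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest archive entries.

    Assumption: a behavior is a fixed-length tuple of floats (for example,
    the agent's final position). Returns 0.0 for an empty archive.
    """
    if not archive:
        return 0.0
    dists = sorted(math.dist(behavior, other) for other in archive)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)


def chief_select(candidates, archive, novelty_weight=0.5, k=3):
    """Hypothetical chief-agent step: pick the best policy, then grow the archive.

    candidates: list of (policy_id, episode_return, behavior) tuples uploaded
    by the exploring agents. The additive score below is an assumption.
    """
    def score(candidate):
        _, episode_return, behavior = candidate
        return episode_return + novelty_weight * novelty(behavior, archive, k)

    best = max(candidates, key=score)
    # Record every evaluated behavior so later policies are compared against it.
    archive.extend(behavior for _, _, behavior in candidates)
    return best[0]


if __name__ == "__main__":
    archive = [(0.0, 0.0)]
    candidates = [("a", 1.0, (0.0, 0.0)), ("b", 0.9, (3.0, 4.0))]
    print(chief_select(candidates, archive))
```

With `novelty_weight=0` the selection reduces to picking the highest per-episode return, so the weight is the single knob that trades exploration against exploitation in this sketch.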