Reward Engineering for Object Pick and Place Training
Raghav Nagpal, Achyuthan Unni Krishnan, Hanshen Yu
- Publication year
- 2020
- Access
- Open access
Abstract
Robotic grasping is a crucial area of research because it can accelerate automation in industries that use robots, from manufacturing to healthcare. Reinforcement learning is the field of study in which an agent learns a policy for choosing actions by exploring an environment and exploiting the rewards it returns. It can therefore be used to teach an agent a specific task, in our case grasping an object. We used the Pick and Place environment provided by OpenAI's Gym to engineer rewards. Hindsight Experience Replay (HER) has shown promising results on problems with sparse rewards. In the default configuration of the OpenAI baseline and environment, the reward function is calculated using the distance between the target location and the robot end-effector. By weighting the cost according to the distance of the end-effector from the goal along the x, y, and z axes, an intuitive strategy, we were able to almost halve the learning time compared to the baselines provided by OpenAI. In this project, we were also able to introduce certain user-desired trajectories into the learnt policies (city-block / Manhattan trajectories). This shows that by engineering the rewards we can tune the agent to learn policies in a desired manner, even if that manner is not the optimal one.
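The contrast between a sparse distance-based reward and an axis-weighted one can be illustrated with a minimal sketch. The function names, the 0.05 threshold, and the weights (1.0, 1.0, 2.0) below are assumptions chosen for illustration; the abstract does not state the authors' reward formula or weight values.

```python
import numpy as np


def sparse_reward(achieved, goal, threshold=0.05):
    """Sparse reward: 0 when within `threshold` of the goal, otherwise -1.

    The threshold value is an assumption for this sketch.
    """
    diff = np.asarray(achieved, dtype=np.float64) - np.asarray(goal, dtype=np.float64)
    distance = np.linalg.norm(diff, axis=-1)
    return -(distance > threshold).astype(np.float32)


def axis_weighted_reward(achieved, goal, weights=(1.0, 1.0, 2.0)):
    """Negative weighted sum of per-axis absolute distances (x, y, z).

    The weights are hypothetical. Summing absolute per-axis distances is
    a city-block (Manhattan) style cost, so each axis can be penalized
    independently of the others.
    """
    diff = np.abs(
        np.asarray(achieved, dtype=np.float64) - np.asarray(goal, dtype=np.float64)
    )
    return -np.sum(np.asarray(weights, dtype=np.float64) * diff, axis=-1)


if __name__ == "__main__":
    position = [0.0, 0.0, 0.0]
    target = [1.0, 1.0, 1.0]
    print(sparse_reward(position, target))
    print(axis_weighted_reward(position, target))
```

Raising the weight on one axis makes error along that axis costlier, which is one plausible way a reward could steer the learnt policy toward a preferred trajectory shape.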