Efficient Exploration in Large State-Action Space Through Structured Action Space for Learning Multirobots Motion Planning
Chaoxu Mu, Zewu Jiang, Jiadong Zhang, Ke Wang, Xin Xu, Jun Yi
- Year: 2025
- Citations: 2
Abstract
Multirobot tasks, such as collaborative motion planning or flexible parts assembly, are challenging for autonomous robot operation. A robot system for such tasks must combine multigoal motion control with collision avoidance in a shared workspace. Reinforcement learning (RL)-based methods offer enhanced flexibility and broader applicability, but they struggle to handle intricate tasks that require collision-free motion planning and control over long time horizons. Current RL systems cannot plan motion efficiently because of the high degrees of freedom (DoF), large-scale continuous action space, and tightly coupled workspaces of multirobot systems. Additionally, the exploration efficiency for multigoal tasks is typically low, particularly in scenarios where rewards are scarce. To tackle these issues, we propose a structured hybrid action space (SHAS), which decomposes the large-scale continuous action space into a combination of a discrete action space that enhances exploration and a small-scale continuous action space that enhances accuracy. To train the model with SHAS efficiently, we develop a task-conditioned hierarchical RL (TC-HRL) framework that trains the high-level (HL) and low-level (LL) policies in parallel. Comparative experiments from several perspectives show that our model learns multirobot manipulation tasks more efficiently and stably, in considerably less time. Finally, we verify our approach on pick-grasp tasks and in experiments with real manipulators.
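As a rough illustration of the hybrid action space idea in the abstract, the sketch below splits a one-dimensional continuous command into a coarse discrete choice plus a small continuous refinement. It is not the authors' SHAS implementation: the per-joint binning, the names `compose_action` and `sample_hybrid_action`, the parameter `n_bins`, the normalized range [-1, 1], and the uniform sampling that stands in for the HL and LL policies are all assumptions made for illustration.

```python
import random


def compose_action(bin_index, residual, low=-1.0, high=1.0, n_bins=8):
    """Combine a discrete bin and a small continuous residual into one command.

    Hypothetical scheme: the discrete part picks one of n_bins equal-width
    intervals of [low, high]; the residual (expected in [-1, 1]) is scaled to
    half a bin width, so the result always stays inside the chosen bin.
    """
    if not 0 <= bin_index < n_bins:
        raise ValueError("bin_index out of range")
    width = (high - low) / n_bins
    center = low + (bin_index + 0.5) * width
    clipped = max(-1.0, min(1.0, residual))  # keep the refinement bounded
    return center + clipped * (width / 2.0)


def sample_hybrid_action(n_joints, n_bins=8, rng=random):
    """Sample one hybrid action per joint.

    Uniform sampling is only a placeholder for learned policies: the discrete
    draw mimics coarse exploration, the continuous draw mimics fine adjustment.
    """
    bins = [rng.randrange(n_bins) for _ in range(n_joints)]
    residuals = [rng.uniform(-1.0, 1.0) for _ in range(n_joints)]
    return [compose_action(b, r, n_bins=n_bins) for b, r in zip(bins, residuals)]
```

Under these assumptions, exploration over the discrete part only has to cover `n_bins` options per joint, while the continuous part searches an interval one `n_bins`-th the size of the original range.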