Home /Research /Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation

MANIPULATION

Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation

Kun Wu, Yichen Zhu, Jinming Li, Junjie Wen, Ning Liu, Zhiyuan Xu, Jian Tang

Year: 2025
Citations: 2

Abstract

Learning visuomotor policy for multi-task robotic manipulation has been a long-standing challenge for the robotics community. The difficulty lies in the diversity of action space: typically, a goal can be accomplished in multiple ways, resulting in a multimodal action distribution for a single task. The complexity of action distribution escalates as the number of tasks increases. In this work, we propose Discrete Policy, a robot learning method for training universal agents capable of multi-task manipulation skills. Discrete Policy employs vector quantization to map action sequences into a discrete latent space, facilitating the learning of task-specific codes. These codes are then reconstructed into the action space conditioned on observations and language instruction. We evaluate our method on both simulation and multiple real-world embodiments, including both single-arm and bimanual robot settings. We demonstrate that our proposed Discrete Policy outperforms a well-established Diffusion Policy baseline and many state-of-the-art approaches, including ACT, Octo, and OpenVLA. For example, in a real-world multi-task training setting with five tasks, Discrete Policy achieves an average success rate that is <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$\mathbf{2 6 \%}$</tex> higher than Diffusion Policy and 15% higher than OpenVLA. As the number of tasks increases to 12, the performance gap between Discrete Policy and Diffusion Policy widens to 32.5 %, further showcasing the advantages of our approach. Our work empirically demonstrates that learning multi-task policies within the latent space is a vital step toward achieving general-purpose agents. Our project is at https://discretepolicy.github.io.

Keywords

Computer scienceTask (project management)Action (physics)Space (punctuation)Artificial intelligenceHuman–computer interactionEngineeringPhysicsSystems engineering

Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation

Abstract

Keywords

Related papers

Statistical Learning Theory

Artificial intelligence: a modern approach

Applied Nonlinear Control

A new optimizer using particle swarm theory