Grounding Vision and Language to 3D Masks for Long-Horizon Box Rearrangement
Ashish Malik, Caleb Lowe, Aayam Shrestha, Stefan Lee, Fuxin Li, Alan Fern
- 发表年份
- 2026
- 访问权限
- 开放获取
摘要
We study long-horizon planning in 3D environments from under-specified natural-language goals using only visual observations, focusing on multi-step 3D box rearrangement tasks. Existing approaches typically rely on symbolic planners with brittle relational grounding of states and goals, or on direct action-sequence generation from 2D vision-language models (VLMs). Both approaches struggle with reasoning over many objects, rich 3D geometry, and implicit semantic constraints. Recent advances in 3D VLMs demonstrate strong grounding of natural-language referents to 3D segmentation masks, suggesting the potential for more general planning capabilities. We extend existing 3D grounding models and propose Reactive Action Mask Planner (RAMP-3D), which formulates long-horizon planning as sequential reactive prediction of paired 3D masks: a "which-object" mask indicating what to pick and a "which-target-region" mask specifying where to place it. The resulting system processes RGB-D observations and natural-language task specifications to reactively generate multi-step pick-and-place actions for 3D box rearrangement. We conduct experiments across 11 task variants in warehouse-style environments with 1-30 boxes and diverse natural-language constraints. RAMP-3D achieves 79.5% success rate on long-horizon rearrangement tasks and significantly outperforms 2D VLM-based baselines, establishing mask-based reactive policies as a promising alternative to symbolic pipelines for long-horizon planning.
关键词
相关论文
一种面向线弧增材制造的电动汽车结构可制造性拓扑优化的双环框架
Qiang Cui, Chuan Yu, Daoqian Yang 等 5 位作者
Robotics and Computer-Integrated Manufacturing · 2026
几何数字孪生:一种用于航空发动机装配精度预测的数字智能模型
Ke Shang, Xin Jin, Teli Xu 等 7 位作者
Robotics and Computer-Integrated Manufacturing · 2026
通过人工智能驱动的机器人技术革新产业
Aryan Chaudhary
Recent Advances in Computer Science and Communications · 2026
新型大口径偏置馈电可展开天线设计与动态性能预测
Chuang Shi, Tianming Liu, Ning Xue 等 9 位作者
Aerospace Science and Technology · 2026