Improving Robotic Manipulation Robustness via NICE Scene Surgery
Sajjad Pakdamansavoji, Mozhgan Pourkeshavarz, Adam Sigal, Zhiyuan Li, Rui Heng Yang, Amir Rasouli
- 发表年份
- 2025
- 访问权限
- 开放获取
摘要
Learning robust visuomotor policies for robotic manipulation remains a challenge in real-world settings, where visual distractors can significantly degrade performance and safety. In this work, we propose an effective and scalable framework, Naturalistic Inpainting for Context Enhancement (NICE). Our method minimizes out-of-distribution (OOD) gap in imitation learning by increasing visual diversity through construction of new experiences using existing demonstrations. By utilizing image generative frameworks and large language models, NICE performs three editing operations, object replacement, restyling, and removal of distracting (non-target) objects. These changes preserve spatial relationships without obstructing target objects and maintain action-label consistency. Unlike previous approaches, NICE requires no additional robot data collection, simulator access, or custom model training, making it readily applicable to existing robotic datasets. Using real-world scenes, we showcase the capability of our framework in producing photo-realistic scene enhancement. For downstream tasks, we use NICE data to finetune a vision-language model (VLM) for spatial affordance prediction and a vision-language-action (VLA) policy for object manipulation. Our evaluations show that NICE successfully minimizes OOD gaps, resulting in over 20% improvement in accuracy for affordance prediction in highly cluttered scenes. For manipulation tasks, success rate increases on average by 11% when testing in environments populated with distractors in different quantities. Furthermore, we show that our method improves visual robustness, lowering target confusion by 6%, and enhances safety by reducing collision rate by 7%.
关键词
相关论文
机器人技术在整形外科中的应用
Vijay Kumar, Sandhya Pandey
Clinical Journal of Plastic & Reconstructive Surgery · 2026
SurfSurg6D:面向无纹理手术器械的几何一致密集对应位姿估计
Daiyun Shen, Shuojue Yang, Chang Han Low 等 7 位作者
2026
EndoGSim:基于MLLM引导的高斯泼溅的物理感知4D动态内窥镜场景模拟
Changjing Liu, Yiming Huang, Long Bai 等 5 位作者
2026
腹膜后机器人辅助肾输尿管切除术:技术描述与单中心经验
Kawashima A, Ishizuya Y, Yamamoto Y 等 12 位作者
Asian journal of endoscopic surgery · 2026