MANIPULATION

TOSQ: Transparent Object Segmentation via Query-Based Dictionary Lookup with Transformers

Bin Ma, Ming Ma, Ruiguang Li, Jiawei Zheng, Deping Li

Publication year
2025
Citations
1
Access
Open access

Abstract

Sensing transparent objects has many applications in daily life, including robot navigation and grasping. The task is challenging because the scene visible through or behind a transparent object is unpredictable: such objects lack fixed visual patterns and suffer from strong background interference. This paper addresses transparent object segmentation by leveraging the intrinsic global modeling capability of transformer architectures. We design a Query Parsing Module (QPM) that formulates segmentation as a dictionary lookup problem, which differs fundamentally from conventional pixel-wise mechanisms: a set of learnable class prototypes serves as the query inputs, and segmentation is obtained via attention-based prototype matching. Based on QPM, we propose a high-performance, end-to-end transformer-based segmentation model, Transparent Object Segmentation through Query (TOSQ). TOSQ's encoder is based on the SegFormer backbone, and its decoder consists of a series of QPMs that progressively refine the segmentation masks. TOSQ achieves state-of-the-art performance on the Trans10K-V2 dataset (76.63% mIoU, 95.34% Acc), with particularly large gains in challenging categories such as windows (+23.59%) and glass doors (+11.22%), demonstrating its capability in transparent object segmentation.
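To make the dictionary-lookup idea concrete, the toy sketch below matches each pixel feature against a small set of class prototypes using scaled dot-product similarity and a softmax over classes. It is an illustration only, not the authors' QPM: the function names (`prototype_lookup`, `segment`), the fixed 2-D prototypes, and the toy features are all assumptions. In the described model the prototypes would be learned and the masks refined over several decoder stages.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def prototype_lookup(prototypes, pixel_features):
    """Return, for each pixel, a probability distribution over classes.

    prototypes:     C vectors of length D (hypothetical class "dictionary")
    pixel_features: N vectors of length D (one per pixel)
    """
    scale = 1.0 / math.sqrt(len(prototypes[0]))  # standard attention scaling
    probs = []
    for feat in pixel_features:
        scores = [dot(proto, feat) * scale for proto in prototypes]
        probs.append(softmax(scores))
    return probs


def segment(prototypes, pixel_features):
    """Assign each pixel the index of its best-matching prototype."""
    probs = prototype_lookup(prototypes, pixel_features)
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


# Toy data (assumed for illustration): two classes, three pixels.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
pixels = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
labels = segment(prototypes, pixels)
print(labels)
```

Here each pixel simply takes the label of the prototype it matches most strongly; a real decoder would operate on backbone feature maps and train the prototypes end to end.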

Keywords

Computer science, Segmentation, Transformer, Artificial intelligence, Parsing, Encoder, Computer vision, Object (grammar), Pattern recognition (psychology), Voltage
