TOSQ: Transparent Object Segmentation via Query-Based Dictionary Lookup with Transformers
Bin Ma, Ming Ma, Ruiguang Li, Jiawei Zheng, Deping Li
- Publication year: 2025
- Citations: 1
- Access: Open access
Abstract
Sensing transparent objects has many applications in daily life, including robot navigation and grasping. The task is challenging because the scene visible through or behind a transparent object is unpredictable: such objects lack fixed visual patterns and suffer from strong background interference. This paper addresses transparent object segmentation by leveraging the intrinsic global modeling capabilities of transformer architectures. We design a Query Parsing Module (QPM) that formulates segmentation as a dictionary lookup problem. In contrast to conventional pixel-wise mechanisms, the QPM takes a set of learnable class prototypes as query inputs and matches them against image features, e.g., via attention-based prototype matching. Based on the QPM, we propose a transformer-based end-to-end segmentation model, Transparent Object Segmentation through Query (TOSQ). TOSQ's encoder is based on the SegFormer backbone, and its decoder is a series of QPMs that progressively refine the segmentation masks. TOSQ achieves state-of-the-art performance on the Trans10K-V2 dataset (76.63% mIoU, 95.34% Acc), with particularly large gains in challenging categories such as windows (+23.59%) and glass doors (+11.22%).
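The dictionary-lookup idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' QPM: the abstract does not specify the module's internals, so the function name `query_lookup`, the residual update, the scaled dot-product scoring, and all sizes below are assumptions chosen only to show how class prototypes can act as queries over pixel features and be refined over several stages.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def query_lookup(prototypes, features):
    """One hypothetical lookup stage (an assumption, not the paper's QPM).

    prototypes: (K, D) class prototypes used as queries.
    features:   (H, W, D) pixel features used as keys and values.
    Returns the refined prototypes (K, D) and per-class mask logits (K, H, W).
    """
    h, w, d = features.shape
    flat = features.reshape(h * w, d)            # (HW, D)
    scores = prototypes @ flat.T / np.sqrt(d)    # (K, HW) query-key similarity
    attn = softmax(scores, axis=1)               # each prototype attends over pixels
    refined = prototypes + attn @ flat           # residual update of the prototypes
    mask_logits = (refined @ flat.T / np.sqrt(d)).reshape(-1, h, w)
    return refined, mask_logits

# Illustrative sizes only; none of these values come from the paper.
rng = np.random.default_rng(0)
num_classes, dim, height, width = 4, 32, 8, 8
prototypes = rng.normal(size=(num_classes, dim))  # stands in for learned parameters
features = rng.normal(size=(height, width, dim))  # stands in for encoder output

# Stack a few stages so the masks are refined progressively.
for _ in range(3):
    prototypes, mask_logits = query_lookup(prototypes, features)

# Each pixel is assigned the class whose prototype matches it best.
label_map = mask_logits.argmax(axis=0)
```

In a trained model the prototypes would be learned parameters and the features would come from the encoder; random arrays stand in for both here.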