History-Enhanced 3D Scene Graph Reasoning From RGB-D Sequences
Mingtao Feng, Chan Kit Yan, Zijie Wu, Weisheng Dong, Yaonan Wang, Ajmal Mian
- Year: 2025
- Citations: 28
Abstract
The 3D scene graph has emerged as a powerful high-level representation of the environment and is considered a prerequisite for long-term autonomous robotic operations. However, building rich representations from RGB-D sequences remains a challenging problem. Existing methods either ignore the semantic gap between linguistic and geometric feature spaces or neglect the importance of historical context in incrementally captured data. These shortcomings limit the learning of visual-textual correspondence and the capability of relationship prediction. To address these problems, we propose a history-enhanced 3D scene graph reasoning framework that incrementally builds a consistent 3D semantic scene graph from an RGB-D image sequence. Specifically, we first introduce a cross-domain unified feature representation module to describe the object instances and their relationships distinctly. Next, we build a one-hot candidate matrix-enabled recurrent mechanism to reason about the 3D scene graph, combining the perceived global and local history information. Finally, we design history-aware supervised semantics contrastive learning to optimize the scene-specific global history features. Extensive experiments on the 3DSSG dataset show the effectiveness of the proposed method in this challenging task, outperforming state-of-the-art approaches. Our code will be available at https://github.com/cbyan1003/HE-3DSGR.
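The abstract does not spell out the history-aware contrastive objective, so the following is only a minimal sketch of a generic supervised contrastive loss, not the authors' formulation. It assumes each global history feature is a plain vector and that `labels` holds a hypothetical per-feature identifier (for example, a scene ID); the function names and the `temperature` default are illustrative assumptions.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, not the paper's loss).

    Features sharing a label are pulled together; all others are pushed apart.
    `labels` is an assumed stand-in for whatever grouping the method uses.
    """
    z = [l2_normalize(f) for f in features]
    n = len(z)
    total, anchor_count = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor samples.
        peak = max(logits.values())
        log_denominator = peak + math.log(
            sum(math.exp(v - peak) for v in logits.values())
        )
        # Mean negative log-probability of the positives for this anchor.
        total -= sum(logits[p] - log_denominator for p in positives) / len(positives)
        anchor_count += 1
    return total / anchor_count if anchor_count else 0.0


# Toy usage: two tight clusters, labelled consistently and inconsistently.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_consistent = supervised_contrastive_loss(toy_features, ["a", "a", "b", "b"])
loss_inconsistent = supervised_contrastive_loss(toy_features, ["a", "b", "a", "b"])
```

In this toy setup the loss is smaller when the labels agree with the geometric clusters, which is the behaviour a contrastive objective over scene-specific features would be expected to reward.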