A Spatiotemporal Approach to Tri-Perspective Representation for 3D Semantic Occupancy Prediction
Sathira Silva, Savindu Bhashitha Wannigama, Gihan Jayatilaka, Muhammad Haris Khan, Roshan Ragel
- 发表年份
- 2024
- 访问权限
- 开放获取
摘要
Holistic understanding and reasoning in 3D scenes are crucial for the success of autonomous driving systems. The evolution of 3D semantic occupancy prediction as a pretraining task for autonomous driving and robotic applications captures finer 3D details compared to traditional 3D detection methods. Vision-based 3D semantic occupancy prediction is increasingly overlooked in favor of LiDAR-based approaches, which have shown superior performance in recent years. However, we present compelling evidence that there is still potential for enhancing vision-based methods. Existing approaches predominantly focus on spatial cues such as tri-perspective view (TPV) embeddings, often overlooking temporal cues. This study introduces S2TPVFormer, a spatiotemporal transformer architecture designed to predict temporally coherent 3D semantic occupancy. By introducing temporal cues through a novel Temporal Cross-View Hybrid Attention mechanism (TCVHA), we generate Spatiotemporal TPV (S2TPV) embeddings that enhance the prior process. Experimental evaluations on the nuScenes dataset demonstrate a significant +4.1% of absolute gain in mean Intersection over Union (mIoU) for 3D semantic occupancy compared to baseline TPVFormer, validating the effectiveness of S2TPVFormer in advancing 3D scene perception.
关键词
相关论文
如何缓解越野环境中语义分割的分布偏移
Ji-Hoon Hwang, Daeyoung Kim, Hyung-Suk Yoon 等 5 位作者
2026
基于原型模糊推理与证据融合的不确定性引导工业机器人可进化识别框架
Yanrun Zhou, Zihao Lei, Guangrui Wen 等 7 位作者
Robotics and Computer-Integrated Manufacturing · 2026
基于点云配准的非破坏性高分辨率涂层厚度三维扫描测量
Simon Duenser, Ivo Aschwanden, Raamadaas Krishnadas 等 5 位作者
Robotics and Computer-Integrated Manufacturing · 2026
迈向智能机器人时代:用于高级感知系统的多模态柔性触觉传感器
Sili Ding, Feng Xu, Jie Chen 等 6 位作者
Progress in Materials Science · 2026