PERCEPTION

MDFusion: Multi-Dimension Semantic–Spatial Feature Fusion for LiDAR–Camera 3D Object Detection

Renzhong Qiao, Hao Yuan, Wenbo Zhang

Publication year: 2025
Citations: 3
Access: Open access

Abstract

Accurate 3D object detection is increasingly vital for robust perception systems, particularly in autonomous driving and robotics. Many existing approaches rely on bird's-eye-view (BEV) feature maps for multi-modal interaction because BEV representations enable efficient operations. However, the inherent sparsity of LiDAR BEV features often leads to misalignment with the dense semantic information in camera images, resulting in suboptimal fusion quality and degraded detection performance, especially in complex and dynamic environments. To mitigate these issues, this paper proposes a multi-dimension semantic–spatial feature fusion (MDFusion) method that combines LiDAR and image features in both 2D and 3D spaces. Specifically, image semantic features are extracted with the DeepLabV3 segmentation network, which captures rich contextual information, and are aligned with LiDAR point cloud voxel features through a summation operation to achieve semantic fusion. Additionally, LiDAR BEV features are fused with downsampled image features in 2D space via concatenation and a spatially adaptive dilated convolution, a mechanism that adjusts dynamically to the spatial characteristics of the data. Extensive experiments on the KITTI and ONCE datasets demonstrate that the method achieves competitive performance in complex scenes, improving multi-modal fusion quality and detection accuracy while maintaining computational efficiency.
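
The two fusion steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the function names, the 3x4 projection matrix `proj`, the use of per-point features as a stand-in for voxel features, and the premise that the image features have already been resized to the BEV grid. The second function applies an ordinary fixed-dilation 3x3 convolution; the spatially adaptive variant described in the abstract is not reproduced here.

```python
import numpy as np


def fuse_semantics_by_summation(points, point_feats, sem_map, proj):
    """Add image semantic features to the LiDAR features that project onto them.

    points:      (N, 3) LiDAR xyz coordinates
    point_feats: (N, C) float LiDAR features (stand-in for voxel features)
    sem_map:     (H, W, C) image semantic features, e.g. segmentation output
    proj:        (3, 4) hypothetical LiDAR-to-image projection matrix
    """
    n = points.shape[0]
    homo = np.hstack([points, np.ones((n, 1))])      # homogeneous coords, (N, 4)
    uvw = homo @ proj.T                               # (N, 3)
    depth = uvw[:, 2]
    in_front = depth > 1e-6                           # keep points in front of the camera
    safe_depth = np.where(in_front, depth, 1.0)       # avoid dividing by zero or negatives
    u = np.floor(uvw[:, 0] / safe_depth).astype(int)  # pixel column
    v = np.floor(uvw[:, 1] / safe_depth).astype(int)  # pixel row
    h, w, _ = sem_map.shape
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    fused = point_feats.copy()
    fused[valid] += sem_map[v[valid], u[valid]]       # summation fusion
    return fused, valid


def fuse_bev_by_concat_dilated(lidar_bev, image_feats, weight, dilation=2):
    """Concatenate two 2D feature maps on channels, then apply a 3x3 dilated conv.

    lidar_bev:   (C1, H, W) LiDAR BEV features
    image_feats: (C2, H, W) image features, assumed already resized to (H, W)
    weight:      (C_out, C1 + C2, 3, 3) convolution kernel
    """
    x = np.concatenate([lidar_bev, image_feats], axis=0)
    _, h, w = x.shape
    # Zero padding equal to the dilation keeps the output at (H, W).
    xp = np.pad(x, ((0, 0), (dilation, dilation), (dilation, dilation)))
    out = np.zeros((weight.shape[0], h, w))
    for ky in range(3):
        for kx in range(3):
            y0, x0 = ky * dilation, kx * dilation
            patch = xp[:, y0:y0 + h, x0:x0 + w]       # input shifted by this kernel tap
            out += np.einsum("oc,chw->ohw", weight[:, :, ky, kx], patch)
    return out
```

Summation requires the semantic map and the LiDAR features to share the same channel count `C`, whereas concatenation lets the two inputs keep different channel counts and leaves the mixing to the convolution kernel.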

Keywords

Lidar, Computer vision, Artificial intelligence, Computer science, Dimension (graph theory), Feature (linguistics), Fusion, Object (grammar), Remote sensing, Pattern recognition (psychology)
