PERCEPTION

A Self-Supervised Monocular Depth Estimation Framework Based on Detail Recovery and Feature Fusion

S. Li, Chongzheng Huang, X. L. Li, Liang Zheng-you

Publication year
2025
Citations
3

Abstract

With the rapid development of autonomous driving and robotics, the demand for high-precision depth estimation is growing steadily. Self-supervised learning methods have reduced the reliance on ground truth depth data, which is expensive and hard to obtain. However, existing models still have design limitations, such as insufficient detail recovery, inadequate feature fusion, and limited global contextual understanding, which reduce the accuracy of the generated depth maps. To address these challenges, this paper proposes a novel self-supervised monocular depth estimation network, Enhanced Depth Fusion Net (EDFNet). Built on an encoder-decoder architecture, EDFNet introduces three key modules: the Adaptive Selective Attention Module (ASAM), the Multi-Dimensional Attention Fusion Module (MDAFM), and the Hybrid Parallel Convolution Block (HPCB). Specifically, ASAM selectively emphasizes critical features and spatial locations in images to enhance the model's ability to capture details. MDAFM captures feature information from different perspectives and levels to address the loss of critical information, while HPCB captures and fuses local and global information through parallel processing. Together, these modules enable EDFNet to generate high-quality depth maps in a variety of complex scenarios. Experimental results on the KITTI dataset demonstrate that the proposed method significantly outperforms state-of-the-art approaches across multiple evaluation metrics, including Abs Rel, Sq Rel, RMSE, and δ < 1.25, and that it mitigates detail loss and edge blurring in depth maps. Additionally, EDFNet demonstrates excellent generalization performance on the Make3D, NYU Depth V2, and Cityscapes datasets.
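For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how Abs Rel, Sq Rel, RMSE, and δ < 1.25 are conventionally computed in the monocular depth literature. It is not taken from the paper: the function name `depth_metrics`, the dictionary keys, and the rule that pixels with non-positive ground truth are ignored are assumptions made for illustration. Predictions are assumed to be strictly positive, and the median scaling that self-supervised methods commonly apply before evaluation is omitted.

```python
import numpy as np


def depth_metrics(pred, gt):
    """Standard depth-evaluation metrics (hypothetical helper, not the paper's code).

    pred, gt: array-likes of the same shape holding depths in the same unit.
    Pixels where gt <= 0 are treated as missing and ignored (an assumption).
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)

    # Keep only pixels that have a valid ground-truth depth.
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]

    diff = pred - gt
    abs_rel = np.mean(np.abs(diff) / gt)   # mean absolute relative error
    sq_rel = np.mean(diff ** 2 / gt)       # mean squared relative error
    rmse = np.sqrt(np.mean(diff ** 2))     # root mean squared error

    # Threshold accuracy: fraction of pixels with max(pred/gt, gt/pred) < 1.25.
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)

    return {"abs_rel": abs_rel, "sq_rel": sq_rel, "rmse": rmse, "delta1": delta1}


if __name__ == "__main__":
    # Toy example: one of three valid pixels is off by a factor of two.
    print(depth_metrics([2.0, 2.0, 4.0], [1.0, 2.0, 4.0]))
```

Lower is better for Abs Rel, Sq Rel, and RMSE, while higher is better for δ < 1.25, which is why the abstract reports them together.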

Keywords

Computer science; Artificial intelligence; Monocular; Feature (linguistics); Computer vision; Fusion; Feature extraction; Sensor fusion; Pattern recognition (psychology)
