Home /Research /DCFA-YOLO: A Dual-Channel Cross-Feature-Fusion Attention YOLO Network for Cherry Tomato Bunch Detection
PERCEPTION

DCFA-YOLO: A Dual-Channel Cross-Feature-Fusion Attention YOLO Network for Cherry Tomato Bunch Detection

Shanglei Chai, Ming Wen, Pengyu Li, Zhi Zeng, Yibin Tian

Year
2025
Citations
17

Abstract

To better utilize multimodal information for agriculture applications, this paper proposes a cherry tomato bunch detection network using dual-channel cross-feature fusion. It aims to improve detection performance by employing the complementary information of color and depth images. Using the existing YOLOv8_n as the baseline framework, it incorporates a dual-channel cross-fusion attention mechanism for multimodal feature extraction and fusion. In the backbone network, a ShuffleNetV2 unit is adopted to optimize the efficiency of initial feature extraction. During the feature fusion stage, two modules are introduced by using re-parameterization, dynamic weighting, and efficient concatenation to strengthen the representation of multimodal information. Meanwhile, the CBAM mechanism is integrated at different feature extraction stages, combined with the improved SPPF_CBAM module, to effectively enhance the focus and representation of critical features. Experimental results using a dataset obtained from a commercial greenhouse demonstrate that DCFA-YOLO excels in cherry tomato bunch detection, achieving an mAP50 of 96.5%, a significant improvement over the baseline model, while drastically reducing computational complexity. Furthermore, comparisons with other state-of-the-art YOLO and other object detection models validate its detection performance. This provides an efficient solution for multimodal fusion for real-time fruit detection in the context of robotic harvesting, running at 52fps on a regular computer.

Keywords

Channel (broadcasting)Dual (grammatical number)Feature (linguistics)FusionArtificial intelligenceComputer scienceTelecommunicationsArt

Related papers

Browse all PERCEPTION papers