From vineyard to vision: Multi-domain analysis and mitigation of grape cluster detection failures in complex viticultural environments
Shubham Rana, Oliver Hensel, Abozar Nasirahmadi
- Publication year: 2025
- Citations: 11
Abstract
• Cross-domain validation: First cross-dataset evaluation of six modern detectors across stratified RGB/NIR vineyard datasets from China, Brazil, and Italy.
• Orientation-aware gains: The oriented YOLOv8-OBB reduces FCR by up to 25–30% on poor-quality NIR, rotated, and occluded grape clusters compared with horizontal baselines.
• Model reliability: YOLOv10, YOLOv11, and YOLOv8-OBB maintain F1 ≥ 0.85 with low metric variance across RGB/NIR and degraded imagery, indicating strong cross-domain generalization.
• Spectral findings: RGB outperforms NIR by 8–10%, but geometry- and scale-aware architectures narrow this spectral gap. The False Classification Rate (FCR) reflected real-world reliability better than mAP alone.
• Task-aware guidance: A decision matrix links vineyard tasks (yield estimation, precision spraying, and autonomous robotics) to recall-oriented, precision-oriented, and geometry-balanced detectors for deployment.

Accurate and robust grape-cluster detection remains a persistent challenge in precision viticulture due to spectral variability, canopy occlusion, and lighting heterogeneity. Recent advancements in the YOLO series have focused on eliminating post-processing bottlenecks such as Non-Maximum Suppression (NMS) to improve inference speed. Furthermore, state-of-the-art models increasingly integrate attention-based mechanisms and hybrid transformer-CNN backbones to enhance feature representation and global context understanding, leading to greater accuracy. This study presents a comprehensive benchmark and error analysis of recent YOLO architectures (v8–v12), including an orientation-aware YOLOv8-OBB, across multispectral (RGB, NIR) vineyard datasets under both normal and degraded imaging conditions; YOLOv11 and YOLOv12 are treated as community implementations rather than official Ultralytics successors.
Models were evaluated using standard metrics (Precision, Recall, F1, mAP@0.5, mAP@0.5:0.95) and a False Classification Rate (FCR) that integrates false positives and false negatives to capture field reliability. Results show that YOLOv10, YOLOv11, and YOLOv8-OBB deliver the highest overall stability and transfer performance, maintaining a consistent F1 ≥ 0.85 across spectral regimes. RGB imagery outperforms NIR by approximately 8–10%, yet OBB regression markedly improves NIR localization, reducing FCR by up to 30% in poor-quality scenes. Cross-dataset experiments further reveal that YOLOv11 sustains the lowest metric variance, while YOLOv8-OBB achieves superior mAP@0.5:0.95 when object orientations vary. The findings emphasize that orientation-aware geometry, domain-robust feature balance, and variance-based reliability metrics are more predictive of field performance than absolute mAP values. The study provides actionable guidance for detector selection in vineyard monitoring and establishes a reproducible benchmark for multispectral object detection under real-world variability.
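
The abstract says only that FCR "integrates false positives and negatives"; it does not give the formula. As a minimal sketch, the Python below assumes FCR = (FP + FN) / (TP + FP + FN), which may differ from the paper's actual definition. The `DetectionCounts` type, the function names, and the counts in the usage example are all hypothetical and are not results from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionCounts:
    """Matched-detection counts at a fixed IoU threshold (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives (missed clusters)


def precision(c: DetectionCounts) -> float:
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: DetectionCounts) -> float:
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def f1(c: DetectionCounts) -> float:
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def fcr(c: DetectionCounts) -> float:
    # ASSUMPTION: FCR = (FP + FN) / (TP + FP + FN).
    # The abstract does not state the formula; this is one plausible reading.
    total = c.tp + c.fp + c.fn
    return (c.fp + c.fn) / total if total else 0.0


def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional drop of a rate relative to a non-zero baseline."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Illustrative counts only, not taken from the paper.
    horizontal = DetectionCounts(tp=90, fp=6, fn=4)
    oriented = DetectionCounts(tp=92, fp=5, fn=3)
    print("F1 (horizontal):", round(f1(horizontal), 3))
    print("FCR (horizontal):", round(fcr(horizontal), 3))
    print("FCR (oriented):", round(fcr(oriented), 3))
    print("Relative FCR reduction:",
          round(relative_reduction(fcr(horizontal), fcr(oriented)), 3))
```

Under this assumed definition, FCR penalizes both spurious boxes and missed clusters in a single number, which is why it can diverge from mAP when a detector trades one error type for the other.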