
Cherry Tomato Bunch and Picking Point Detection for Robotic Harvesting Using an RGB-D Sensor and a StarBL-YOLO Network

Pengyu Li, Ming Wen, Zhi Zeng, Yibin Tian

Publication year
2025
Citations
8
Access
Open access

Abstract

For fruit harvesting robots, rapid and accurate detection of fruits and picking points is one of the main challenges for practical deployment. Several fruits, such as grapes, cherry tomatoes, and blueberries, typically grow in clusters or bunches. For such clustered fruits, it is desirable to pick whole bunches rather than individual fruits. This study proposes using a low-cost, off-the-shelf RGB-D sensor mounted on the end effector, together with a lightweight improved YOLOv8-Pose neural network, to detect cherry tomato bunches and picking points for robotic harvesting. Occlusion and overlap are alleviated by merging the RGB and depth images from the RGB-D sensor. To enhance detection robustness against complex backgrounds and reduce model complexity, the Starblock module from StarNet and the coordinate attention mechanism are incorporated into the YOLOv8-Pose network to improve the efficiency of feature extraction and reinforce spatial information; the resulting model is termed StarBL-YOLO. Additionally, the original OKS loss function is replaced with the L1 loss function for keypoint loss calculation, which improves picking point localization accuracy. The proposed method was evaluated on a dataset of 843 cherry tomato RGB-D image pairs acquired by a harvesting robot at a commercial greenhouse farm. Experimental results show that the proposed StarBL-YOLO model achieves a 12% reduction in model parameters compared with the original YOLOv8-Pose while improving detection accuracy for cherry tomato bunches and picking points. Specifically, the model improves on all reported metrics: for computational efficiency, model size (−11.60%) and GFLOPs (−7.23%); for pickable bunch detection, mAP50 (+4.4%) and mAP50-95 (+4.7%); for non-pickable bunch detection, mAP50 (+8.0%) and mAP50-95 (+6.2%); and for picking point detection, mAP50 (+4.3%), mAP50-95 (+4.6%), and RMSE (−23.98%). These results indicate that StarBL-YOLO substantially enhances detection accuracy for cherry tomato bunches and picking points while improving computational efficiency, which is valuable for resource-constrained edge-computing deployment on harvesting robots.
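
To make the keypoint loss substitution mentioned in the abstract concrete, the sketch below contrasts a simplified OKS-style loss with a plain L1 loss for a single picking point, and adds a pixel-distance RMSE. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `sigma` value, the use of bounding-box area as the OKS scale, and the Euclidean definition of RMSE are all hypothetical choices made for this example.

```python
import math


def oks_keypoint_loss(pred, gt, area, sigma=0.1):
    """Simplified OKS-style loss (1 - OKS) for one keypoint.

    Assumptions for illustration: `area` is the bounding-box area in
    square pixels and `sigma` is a per-keypoint tolerance constant.
    """
    d2 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    e = d2 / (2.0 * area * (2.0 * sigma) ** 2 + 1e-9)
    return 1.0 - math.exp(-e)


def l1_keypoint_loss(pred, gt):
    """L1 loss for one keypoint: sum of absolute coordinate errors."""
    return abs(pred[0] - gt[0]) + abs(pred[1] - gt[1])


def keypoint_rmse(preds, gts):
    """RMSE over Euclidean pixel errors (assumed definition)."""
    sq = [(p[0] - g[0]) ** 2 + (p[1] - g[1]) ** 2 for p, g in zip(preds, gts)]
    return math.sqrt(sum(sq) / len(sq))


if __name__ == "__main__":
    gt = (100.0, 50.0)
    for err in (1.0, 5.0, 20.0):
        pred = (gt[0] + err, gt[1])
        # The OKS-style loss depends on object area and flattens out for
        # large errors; the L1 loss grows linearly with the pixel error.
        print(
            f"err={err:5.1f}px  "
            f"oks(small box)={oks_keypoint_loss(pred, gt, area=400.0):.3f}  "
            f"oks(large box)={oks_keypoint_loss(pred, gt, area=40000.0):.3f}  "
            f"l1={l1_keypoint_loss(pred, gt):.1f}"
        )
```

In this sketch the OKS-style loss for a fixed pixel error shrinks as the box grows and saturates near 1 for large errors, whereas the L1 loss penalizes the same pixel error identically regardless of box size. Whether this property accounts for the improvement reported in the abstract is not stated there.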

Keywords

Artificial intelligence; RGB color model; Computer vision; Robustness (evolution); Computer science; Robot; Point cloud
