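Method for recognizing and locating tomato cluster picking points based on RGB-D information fusion and target detection

2021-11-24 Chen Jianmin

Transactions of the Chinese Society of Agricultural Engineering, 2021, Issue 18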


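Zhang Qin1, Chen Jianmin1, Li Bin2※, Xu Can3

(1. School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510641, China; 2. School of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, China; 3. Guangdong Institute of Modern Agricultural Equipment, Guangzhou 510630, China)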

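Recognizing and locating picking points is a key technology for intelligent harvesting and an important guarantee of efficient, timely, and damage-free picking. To address the recognition and location of tomato cluster picking points against complex backgrounds, a method based on RGB-D information fusion and target detection is proposed. The YOLOv4 target detection algorithm, together with the connectivity between each tomato cluster and its stem, is used to quickly identify the regions of interest (ROI) of tomato clusters and pickable stems. Depth information and color features in the RGB-D image are then fused to recognize the picking point: a depth segmentation algorithm, morphological operations, K-means clustering, and a thinning algorithm extract the stem image and yield the image coordinates of the picking point. The depth map and color image of the stem are matched to obtain the precise coordinates of the picking point in the camera coordinate system, which guide the robot through the picking task. Experiments and extensive field tests show that the method recognizes and locates tomato cluster picking points against complex, similarly colored backgrounds, with an average recognition time of 54 ms per frame, a picking point recognition success rate of 93.83%, and a picking point depth error of ±3 mm, which meets the real-time requirements of automatic picking.

Keywords: image recognition; object recognition; extraction; tomato cluster; RGB-D image; information fusion; target detection; picking point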

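0 Introduction

Manual picking of tomato clusters is strongly seasonal, labor-intensive, and costly. As the tomato planting area expands and labor costs rise year by year, intelligent harvesting by robots in place of manual labor is the direction of future development[1-3]. Recognizing and locating picking points is a key technology for robotic harvesting and an important guarantee of efficient, timely, and damage-free picking.

A harvesting robot must recognize and locate the picking point on the stem of a tomato cluster before it can pick the cluster effectively. Because the stem is similar in color to the background and irregular in shape, and because of interference from leaves and branches and uncertain lighting conditions, images often contain noise and various disturbances that reduce picking point recognition accuracy[4]. In addition, because the stem is thin, the depth information that an economical depth camera acquires for such a small target can have large errors or be missing altogether, which makes it difficult for the robot to locate the picking point precisely.

Stem picking points are currently recognized and located in two main ways: by predicting the location from the shape features of the fruit, or by recognizing the stem from the positional relationship between stem and fruit and then locating the picking point on it. Among studies that predict the stem picking point from fruit shape features, Feng Qingchun et al.[5-6] used the features of cucumbers or strawberries and took the point above the top vertex of the fruit contour as the picking point. Chen et al.[7] recognized tomatoes from the geometric features of their curved surfaces and predicted the stem position from tomato growth characteristics to locate the stem picking point. Ling et al.[8] took the point 60 mm above the center of the recognized tomato circle as the picking point. Among studies that recognize the stem through the relative position of stem and fruit, Xiong Juntao et al.[9-12] recognized the fruit, predicted the stem region from the fruit centroid, and recognized the stem with a line detection algorithm and the relationship between the stem line and the centroid, from which the picking point was obtained. Xiong Juntao et al.[13], working against a simple background, recognized litchi by color features and then obtained the picking point on the stalk from the positional relationship between litchi and stalk combined with line detection. Zhuang et al.[14] extracted litchi and branch region images by color features, recognized the stem by Harris corner detection, and located the picking point from the positional relationship between the corners and the fruit centroid. Benavides et al.[15] obtained tomato and stem images by color features and recognized the picking point from the tomato pose and the positional relationship between the tomato centroid and the stem. Liang Xifeng et al.[16] extracted the tomato cluster region, determined the stem position from the relationship between the fruit centroid and the stem, extracted and thinned the stem image, and took the corner point between the first fruit of the cluster and the main stalk as the picking point.

Existing methods for recognizing and locating stem picking points all require the fruit and stem to have a relatively fixed shape. Tomato clusters, however, vary in shape and their stems have irregular poses, so there is no fixed positional relationship between a suitable picking point and the fruit centroid, and it is difficult to recognize and locate the picking point accurately from positional relationships or shape features. Methods that detect fruit and stems and locate picking points from point cloud information take a long time to compute and can hardly meet the real-time requirements of picking[17-22].

Because deep neural network target detection algorithms detect targets quickly and are also highly robust, they have recently been applied to picking point recognition[23-24]. Ning Zhengtong et al.[25] recognized grape picking points with a convolutional neural network and region growing, taking 4.9 s. Chen Yan et al.[26] used the YOLOv3 target detection algorithm to obtain the position of litchi clusters in the image and combined it with binocular stereo vision to pre-locate the clusters, with a mean absolute location error of 23 mm. Yu et al.[27] used an improved R-YOLO target detection algorithm to quickly recognize strawberries and their pose angles in the image and predicted the picking point position from the pose angle. Arad et al.[28] recognized the position of sweet peppers with a deep learning algorithm and predicted the stem position from sweet pepper growth characteristics. Because tomato cluster stems are slender and similar in color to the main stalk and leaves, it is difficult in complex scenes to quickly recognize the stem and accurately extract its image using color features or deep neural networks alone, and therefore difficult to recognize the picking point precisely. Moreover, when an economical depth camera based on active stereo infrared imaging acquires depth information for small targets, the depth values can have large errors or be missing, so the picking point is not located with sufficient accuracy. These problems limit the application of intelligent harvesting robots to real tomato cluster picking.

To address these problems, this study proposes a method for recognizing and locating tomato cluster picking points based on RGB-D information fusion and target detection. The YOLOv4 target detection algorithm[29] and the connectivity between each tomato cluster and its stem are used to quickly recognize pickable stems against complex backgrounds. Then a depth information segmentation algorithm, morphological operations, K-means clustering, and a thinning algorithm are used to extract stem edges precisely and recognize the picking point against similarly colored backgrounds. Finally, a location algorithm based on RGB-D information fusion and mean-depth filling of the target stem extracts a precise depth value for the picking point, so that the picking point is located precisely.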

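1 Method for recognizing and locating tomato cluster picking points

To recognize and locate tomato cluster picking points against complex backgrounds, a method based on RGB-D information fusion and target detection is proposed. The algorithm flow is shown in Fig. 1 and has three main parts: (1) quickly recognizing pickable stems using the YOLOv4 target detection algorithm and the connectivity between each tomato cluster and its stem; (2) obtaining the three-dimensional coordinates of the picking point P through RGB-D information fusion and a comprehensive segmentation algorithm; (3) guiding the harvesting robot to carry out the picking operation according to the picking point coordinates.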

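2 Data sample collection and data processing

2.1 Dataset collection and construction

Data samples were collected at a tomato garden of the Guangdong Agricultural Technology Extension Station; the tomato cluster variety studied was Jinlinglong. The plants were grown on trusses with a plant spacing of 33 cm and a truss row spacing of 160 cm, the tomato growing medium was at a height of 75-106 cm, and guide rails laid on the ground allowed the harvesting robot to move along a track. Samples of two further varieties, Yuekeda 202 tomato clusters and Israeli red tomato clusters, were collected as supplementary data for the method. The characteristics of each variety are shown in Table 1.

Table 1 Characteristics of tomato clusters of different varieties

1) Data acquisition conditions

RGB images were collected as training samples for the tomato cluster and stem YOLOv4 target detection model. Testing the picking point recognition and location method requires fused image depth information, so RGB-D images were collected as test samples. When collecting RGB images, a digital camera was used to photograph tomatoes from different angles and under different lighting conditions to improve the robustness of the YOLOv4 model, as shown in Fig. 2. When collecting RGB-D images, a RealSense™ Depth Camera D435i (hereafter D435i) was used to capture RGB-D images of tomato clusters from different picking positions and under different lighting conditions. The RGB image and depth map output by the camera both had a resolution of 1 280×720 pixels at a frame rate of 30 frames/s. Because the RGB image and depth map of the D435i come from different imaging sources, the image coordinate system of the depth image must be transformed into that of the color image so that each pixel of the RGB image is matched with its depth value; in this study, image registration was done by aligning the depth map to the RGB image.

2) Dataset construction

Under these acquisition conditions, 2 617 RGB images and 336 RGB-D images of Jinlinglong tomato clusters were collected. In addition, 1 027 RGB images and 46 RGB-D images of Yuekeda 202, and 1 174 RGB images and 147 RGB-D images of Israeli red tomato clusters, were collected as datasets for verifying that the method can recognize and locate picking points for different tomato cluster varieties.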

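2.2 Construction of the tomato cluster, stem, and picking point valid-region datasets

2.2.1 Construction of the tomato cluster and stem dataset

To make automatic picking more practicable, the dataset was divided into two classes, tomato clusters and stems; examples of dataset construction are shown in Fig. 2. Target tomato cluster samples that meet picking requirements were labeled "1" and target stem samples were labeled "0". The labeled samples were divided into a training set and a test set, as shown in Table 2.

Table 2 Tomato cluster and stem dataset

2.2.2 Construction of the picking point valid-region dataset

To test the picking point recognition accuracy of the proposed method, the valid region of each stem in the test samples was marked manually. If the picking point lies within the stem region, it is considered valid; otherwise recognition is considered to have failed. The valid region serves as the ground truth for picking point recognition and is used to evaluate the performance of the recognition algorithm. The labeling results for pickable stems are shown in Table 3.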
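The validity check described above can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: the valid region is simplified here to an axis-aligned rectangle, and the names `point_in_region` and `recognition_rate` as well as the `(x_min, y_min, x_max, y_max)` layout are assumptions made for the example.

```python
from typing import List, Tuple

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), assumed layout


def point_in_region(point: Point, region: Box) -> bool:
    """Return True if the pixel lies inside the labeled valid region (edges included)."""
    x, y = point
    x_min, y_min, x_max, y_max = region
    return x_min <= x <= x_max and y_min <= y <= y_max


def recognition_rate(points: List[Point], regions: List[Box]) -> float:
    """Fraction of predicted picking points that fall inside their valid region."""
    if not points:
        return 0.0
    hits = sum(point_in_region(p, r) for p, r in zip(points, regions))
    return hits / len(points)
```

A manually drawn valid region would more likely be an irregular mask; in that case the rectangle test would be replaced by a lookup in a binary mask at the predicted pixel.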

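Table 3 Pickable stem dataset

3 Fast recognition algorithm for tomato cluster and pickable stem ROIs

To extract discriminative features of tomato clusters and stems and quickly recognize pickable clusters and stems against complex backgrounds, the YOLOv4 target detection algorithm is used. It performs global detection on the input image and fuses multi-scale features to recognize targets, which allows fast detection of tomato cluster ROIs and stem ROIs. Pickable stem ROIs are then screened out using the connectivity between each tomato cluster and its stem.

3.1 YOLOv4 model framework for tomato cluster and stem target detection

3.2 Fast recognition of tomato cluster and stem ROIs

3.3 Recognition of pickable stem ROIs based on connectivity

The procedure for screening pickable stem ROIs is as follows: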
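The individual screening steps are not preserved in this text. As a rough illustration of what a connectivity test between detection boxes could look like, the sketch below keeps a stem ROI only if it overlaps, or lies within a small pixel margin of, some tomato cluster ROI. The overlap criterion, the `margin` parameter, and the function names are assumptions for the example, not the published procedure.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), assumed layout


def boxes_connected(a: Box, b: Box, margin: int = 0) -> bool:
    """True if the two boxes overlap or lie within `margin` pixels of each other."""
    return (a[0] - margin <= b[2] and b[0] - margin <= a[2]
            and a[1] - margin <= b[3] and b[1] - margin <= a[3])


def pickable_stems(clusters: List[Box], stems: List[Box], margin: int = 5) -> List[Box]:
    """Keep only the stem boxes that are connected to at least one cluster box."""
    return [s for s in stems if any(boxes_connected(s, c, margin) for c in clusters)]
```

For example, with one cluster box `(100, 200, 300, 400)`, a stem box ending 2 pixels above it is kept, while a stem box far to the right is discarded.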

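4 Algorithm for recognizing and locating picking point coordinates

The stem is thin and similar in color to the background, and the depth information from an economical depth camera is not accurate enough for picking. To recognize and locate the picking point precisely, an RGB-D information fusion algorithm and a comprehensive segmentation algorithm are used to extract stem edges against similarly colored backgrounds and then recognize and locate the picking point.

4.1 Background noise removal algorithm

Against similarly colored backgrounds, the color features of the stem are not distinctive and the image contains much noise, which makes the stem difficult to segment and extract. To solve this problem, this study proposes a background noise removal algorithm based on depth information segmentation and morphological operations, which uses depth information to remove the complex background, reduces noise during stem segmentation, and improves the accuracy of stem boundary segmentation. Given how tomato plants are planted, the row of tomatoes nearest the harvesting robot is treated as foreground and everything else as background; the depth difference between foreground and background is used to remove the background, with depth information segmentation as given in Eq. (2). Because the depth camera can return depth data with large errors or missing values, a morphological closing operation is applied to the stem region during background removal to preserve the complete stem image and reduce the amount of stem removed along with the background.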
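A minimal sketch of this idea follows, written with nested lists so that it runs without any image library. It assumes the segmentation is a simple depth cut-off (a pixel is foreground if it has a valid reading no farther than `max_depth_mm`); that threshold form, the 3×3 structuring element, and all function names are assumptions for illustration, since Eq. (2) is not reproduced in this text.

```python
from typing import List

Grid = List[List[int]]


def depth_foreground_mask(depth: Grid, max_depth_mm: int) -> Grid:
    """1 where a valid depth reading lies in front of the cut-off, else 0."""
    return [[1 if 0 < d <= max_depth_mm else 0 for d in row] for row in depth]


def _morph(mask: Grid, dilate: bool) -> Grid:
    """3x3 dilation (dilate=True) or erosion (dilate=False).

    Pixels outside the image are ignored, which matches the usual border
    handling for binary morphology.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = int(any(window) if dilate else all(window))
    return out


def close_mask(mask: Grid) -> Grid:
    """Morphological closing: dilation followed by erosion, which fills small holes."""
    return _morph(_morph(mask, True), False)
```

Applied to a patch of stem pixels at about 400 mm with one missing reading (depth 0) in the middle, the threshold leaves a hole in the mask and the closing fills it, so the stem is not cut apart when the background is removed.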

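4.2 Recognition of picking point image coordinates

To recognize the picking point on the stem accurately, color is used as the stem recognition feature. K-means clustering is applied to the pixels to extract the stem image, and the Zhang-Suen thinning algorithm[30] then extracts the skeleton of the stem image, from which the picking point on the stem is recognized precisely. Sixty images were randomly selected from Table 2, giving 85 pickable stem ROIs with part of the background removed. In each stem ROI, 4 stem feature points and 4 background feature points were randomly sampled, for a total of 340 stem and 340 background feature points, which were converted to the RGB, HSV, and LAB color spaces for distribution statistics. The statistics show that after part of the background is removed, noise within the stem region of interest is greatly reduced, and in the RGB color space the stem and background distributions differ clearly, as shown in Fig. 5. The R, G, and B values are therefore used as recognition features, combined with K-means clustering, to extract the stem image.

Initial cluster centers are chosen randomly, and clustering stops when the number of iterations reaches 10 or the clustering accuracy reaches 1. To improve the accuracy of stem segmentation, K-means clustering is applied twice. In the first pass, the share of each cluster is computed and small clusters are assigned to the background to remove part of the noise. In the second pass, the squared difference between the RGB values of each cluster center and the standard stem RGB values is computed, and the cluster with the smallest squared difference is recognized as the stem. The stem image is then extracted, and a morphological opening operation removes noise and holes. In actual picking, the picking point of a tomato cluster is usually at the center of the stem, so the intersection P of the stem skeleton with the central axis line is taken as the picking point; if there are several intersections, their average is taken as the picking point.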
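The two-pass clustering can be sketched as follows. This is an illustrative pure-Python version under stated assumptions: the number of clusters `k`, the `min_share` cut-off for "small" clusters, and the reference stem color are hypothetical values not given in the text, and "clustering accuracy reaches 1" is interpreted here as the largest center shift falling to 1 or below.

```python
import random
from typing import List, Sequence, Tuple

RGB = Tuple[float, float, float]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two colors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(pixels: List[RGB], k: int, max_iter: int = 10, eps: float = 1.0,
           seed: int = 0) -> Tuple[List[RGB], List[int]]:
    """Lloyd's K-means with random initial centers; stops after max_iter
    iterations or when no center moves by more than eps."""
    rng = random.Random(seed)
    centers = [tuple(pixels[i]) for i in rng.sample(range(len(pixels)), k)]
    labels = [0] * len(pixels)
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in pixels]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centers.append(centers[c])  # an empty cluster keeps its old center
        shift = max(sq_dist(a, b) ** 0.5 for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= eps:
            break
    return centers, labels


def drop_small_clusters(labels: List[int], k: int, min_share: float) -> List[int]:
    """Pass 1: indices of pixels kept after clusters below min_share go to background."""
    n = len(labels)
    keep = {c for c in range(k) if labels.count(c) / n >= min_share}
    return [i for i, lab in enumerate(labels) if lab in keep]


def stem_cluster(centers: List[RGB], stem_rgb: RGB) -> int:
    """Pass 2: index of the center with the smallest squared RGB difference to the reference."""
    return min(range(len(centers)), key=lambda c: sq_dist(centers[c], stem_rgb))
```

In use, the first pass would run `kmeans` on all ROI pixels and `drop_small_clusters` on its labels; the second pass would run `kmeans` again on the remaining pixels and pick the stem with `stem_cluster`.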

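The procedure for recognizing the image coordinates of the picking point P based on K-means clustering and the thinning algorithm is as follows: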


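5) The cluster whose center has the smallest squared difference between its R, G, B values and the standard stem R, G, B values is taken as the stem;

7) The Zhang-Suen thinning algorithm extracts the stem skeleton;

8) The intersection P of the stem skeleton with the central axis line is computed and set as the picking point.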
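Steps 7) and 8) can be illustrated with a compact version of the Zhang-Suen thinning algorithm[30] followed by the intersection with a center line. The sketch assumes a binary mask stored as nested lists with a one-pixel zero border, and it assumes that the "central axis line" is the horizontal mid-line of the stem ROI; the axis is not legible in this text, so that choice and the function names are assumptions for the example.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]


def zhang_suen(mask: Grid) -> Grid:
    """Thin a binary mask (1 = stem) to a one-pixel-wide skeleton."""
    img = [row[:] for row in mask]
    h, w = len(img), len(img[0])

    def neighbours(y: int, x: int) -> List[int]:
        # P2..P9, clockwise, starting from the pixel directly above
        return [img[y - 1][x], img[y - 1][x + 1], img[y][x + 1], img[y + 1][x + 1],
                img[y + 1][x], img[y + 1][x - 1], img[y][x - 1], img[y - 1][x - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if img[y][x] != 1:
                        continue
                    n = neighbours(y, x)
                    b = sum(n)  # number of non-zero neighbours
                    a = sum(n[i] == 0 and n[(i + 1) % 8] == 1 for i in range(8))
                    p2, p4, p6, p8 = n[0], n[2], n[4], n[6]
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_clear.append((y, x))
            for y, x in to_clear:  # delete only after the whole sub-iteration
                img[y][x] = 0
            changed = changed or bool(to_clear)
    return img


def picking_point(skeleton: Grid) -> Optional[Tuple[float, float]]:
    """Intersect the skeleton with the horizontal mid-line of the ROI.

    Several intersections are averaged; None is returned if there is none.
    """
    mid = len(skeleton) // 2
    xs = [x for x, v in enumerate(skeleton[mid]) if v == 1]
    if not xs:
        return None
    return (sum(xs) / len(xs), float(mid))
```

For a vertical bar three pixels wide, the thinning leaves a single column of pixels, and the picking point is the pixel where that column crosses the middle row of the ROI.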

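4.3 Extraction of the picking point depth value

3) First calculation of the mean of the non-zero stem depth values:

6) Extraction of the optimal depth value: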
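Only two of the steps survive in this text, so the following is a hedged sketch of one way a mean-filling scheme of this kind could work, not the published algorithm. It assumes that a first mean is taken over the non-zero stem depths, that readings deviating from it by more than a tolerance are discarded before a second mean is taken, and that the raw depth at the picking point is kept only if it agrees with that mean. The tolerance `tol_mm` and the function names are hypothetical.

```python
from typing import List, Optional


def mean_nonzero(values: List[float]) -> Optional[float]:
    """Mean of the strictly positive readings, or None if there are none."""
    valid = [v for v in values if v > 0]
    return sum(valid) / len(valid) if valid else None


def picking_point_depth(stem_depths: List[float], raw_depth: float,
                        tol_mm: float = 100.0) -> Optional[float]:
    """Return a depth for the picking point, falling back to the stem mean."""
    first_mean = mean_nonzero(stem_depths)
    if first_mean is None:
        return None  # no usable depth anywhere on the stem
    # Discard readings far from the first mean, then average again.
    inliers = [v for v in stem_depths if v > 0 and abs(v - first_mean) <= tol_mm]
    refined = mean_nonzero(inliers) or first_mean
    # Keep the raw reading only if it is valid and consistent with the stem mean.
    if raw_depth > 0 and abs(raw_depth - refined) <= tol_mm:
        return raw_depth
    return refined
```

With stem readings clustered around 500 mm plus one background reading of 900 mm, a missing raw value (0) at the picking point is replaced by the refined mean of 500 mm, while a raw value of 503 mm is kept as measured.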

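5 Vision-guided robotic picking algorithm

5.1 Tomato cluster harvesting robot system

The tomato cluster harvesting robot system is shown in Fig. 6. It consists of a mobile platform, a 6-degree-of-freedom robotic arm, a picking gripper, a depth camera, and a controller. The arm and depth camera are mounted on the mobile platform, which moves along the guide rails and is adjustable in height. The arm is an AUBO-i3 with a maximum end payload of 3 kg, a repeatability of ±0.02 mm, and a maximum reach of 832 mm. The picking gripper combines cutting and clamping in one design; the cutting fingers open to a maximum width of 23 mm and can cut stems up to 5 mm in diameter. The depth camera is an economical D435i (Intel RealSense), which is inexpensive and compact and therefore easy to install and use in complex environments. Within 1 m, the D435i provides accurate depth information[32] and outputs RGB-D images at a resolution of 1 280×720 pixels and 30 frames/s. The controller is installed inside the mobile platform and is equipped with 8 GB of RAM and a GeForce GTX2060 GPU.

1. Mobile platform 2. Robotic arm 3. Picking gripper 4. Depth camera (D435i)

5.2 Robot hand-eye coordinate transformation

Eq. (5) is then used to transform the picking point coordinates into the robotic arm coordinate system, giving the spatial coordinates of the picking point in that system.

Finally, the harvesting robot is guided to complete the picking action according to the spatial coordinates of the picking point in the robotic arm coordinate system[34].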
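The chain from pixel to arm coordinates can be sketched with the standard pinhole camera model followed by a 4×4 homogeneous transform. The equations of this section are not reproduced in the text, so this is a generic illustration: the intrinsics `fx`, `fy`, `cx`, `cy` and the hand-eye matrix `T` below are placeholder values, not calibration results from the study.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def pixel_to_camera(u: float, v: float, depth_mm: float,
                    fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Back-project a pixel with known depth into the camera frame (pinhole model)."""
    x = (u - cx) * depth_mm / fx
    y = (v - cy) * depth_mm / fy
    return (x, y, depth_mm)


def camera_to_robot(point_cam: Sequence[float], T: List[List[float]]) -> Vec3:
    """Apply a 4x4 homogeneous camera-to-arm transform to a 3D point."""
    p = (point_cam[0], point_cam[1], point_cam[2], 1.0)
    x, y, z = (sum(T[r][c] * p[c] for c in range(4)) for r in range(3))
    return (x, y, z)


# Placeholder calibration values, for illustration only.
T_example = [[1.0, 0.0, 0.0, 10.0],
             [0.0, 1.0, 0.0, 20.0],
             [0.0, 0.0, 1.0, 30.0],
             [0.0, 0.0, 0.0, 1.0]]
p_cam = pixel_to_camera(740, 360, 500.0, fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
p_arm = camera_to_robot(p_cam, T_example)
```

In practice the intrinsics would come from camera calibration and `T` from a hand-eye calibration of the camera mounted on the platform.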

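6 Experiments

6.1 YOLOv4 model training and selection of the optimal model

The training set in Table 2 was used to train the tomato cluster and stem YOLOv4 target detection model, and the optimal model was then selected.

1) YOLOv4 model training environment

The YOLOv4 network was trained on a computer with an Intel i7-9750H CPU, a GeForce GTX1080Ti GPU, and 16 GB of RAM. The development environment was Windows 10 (64-bit), VS2019, C++, and OpenCV 4.1.

2) YOLOv4 training parameter settings and training

YOLOv4 uses feature maps at 3 scales during training, so 9 anchors corresponding to the 3 scales must be set beforehand. The 9 anchors computed by K-means clustering were (19×20), (12×44), (35×26), (23×40), (26×73), (41×52), (41×92), (46×131), and (72×177). The greenhouse environment is complex, and the poses of tomato clusters and stems and the lighting conditions are all uncertain. To improve the robustness of the trained model, the dataset was expanded during training by data augmentation: image stitching and changes in image angle (range 0° to 5°), hue (range 1.0 to 1.5 times), exposure (range 1.0 to 1.5 times), and saturation (range 0.8 to 1.1 times). The model was trained for a total of 9 000 iterations, and training was evaluated by the loss value, given in Eq. (6).

The change in the loss value is shown in Fig. 7. After 6 000 iterations the model gradually stabilized, and the final loss value was close to 2.1.

3) YOLOv4 model evaluation and selection of the optimal model

The mean average precision (mAP) was used to evaluate the overall performance of the model, as given in Eq. (7)[29].

The mAP computed for each model is shown in Fig. 8. The mAP levels off between 7 000 and 9 000 iterations. Within this range, the mAP reaches its best value, 79.55%, at 8 755 iterations, and this model was selected as the optimal model.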

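6.2 Model recognition accuracy test

Tomato cluster and stem target detection was run on the test samples in Table 2 to test the detection accuracy of the model; the results are shown in Table 4. Precision, recall, and the F1 score were used as evaluation indices of YOLOv4 detection accuracy. The model achieved a precision of 95.4% and a recall of 99.0% for tomato clusters, and a precision of 98.9% and a recall of 97.2% for stems. The overall F1 score of the model was 0.967.

During accuracy testing of the tomato cluster and stem YOLOv4 target detection model, the overlap coefficient OC was used to evaluate detection accuracy. OC is the overlap ratio between the detected box and the ground-truth box, and is calculated as[29]

Table 4 Target detection results of the YOLOv4 model

Detection is considered successful when OC ≥ 80%.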
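The OC formula itself is not reproduced in this text. A common way to measure the overlap between a detected box and a ground-truth box is intersection over union, and the sketch below assumes OC is computed that way; if the original formula divides by a different area, only the denominator would change. The function names and the `(x_min, y_min, x_max, y_max)` layout are assumptions for the example.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def overlap_ratio(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0  # the boxes do not overlap
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def is_detected(pred: Box, truth: Box, threshold: float = 0.8) -> bool:
    """Apply the success criterion: overlap of at least 80%."""
    return overlap_ratio(pred, truth) >= threshold
```

For instance, a predicted box covering the top 80% of a ground-truth box of the same width has an overlap ratio of exactly 0.8 and counts as a successful detection.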

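Precision and recall are expressed as

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}$$

where TP is the number of positive samples predicted as positive, FP is the number of negative samples predicted as positive, and FN is the number of positive samples predicted as negative.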
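As a short worked example, the functions below compute precision and recall from TP, FP, and FN as defined above, plus the F1 score as the harmonic mean of the two, which is its standard definition. The counts in the usage lines are made-up numbers for illustration, not values from Table 4.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of true positives that were found."""
    return tp / (tp + fn)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Illustrative counts only: 90 correct detections, 10 false alarms, 30 misses.
p = precision(90, 10)   # 0.9
r = recall(90, 30)      # 0.75
f1 = f1_score(p, r)     # 9/11, about 0.818
```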

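6.3 Accuracy tests of the picking point recognition method

6.3.1 Picking point recognition accuracy test

The picking point recognition method was tested on the pickable stems in Table 3. As described in Section 2.2.2, recognition is counted as successful if the recognized picking point lies within the valid region, and as failed otherwise. To test the method on different tomato cluster varieties, data augmentation and training were carried out as in Section 6.1 to obtain YOLOv4 target detection models for Yuekeda 202 and Israeli red tomato clusters. These models were loaded into the proposed method for picking point recognition tests. The recognition process on the Jinlinglong test set is shown in Fig. 9, and the test results are shown in Table 5. The picking point recognition accuracy was 93.83% on the Jinlinglong test set, 94.87% on the Israeli red tomato cluster test set, and 90.19% on the Yuekeda 202 test set. The average recognition time for a single frame at 1 280×720 pixels was 54 ms. The tests show that the proposed method can recognize the picking points of different tomato cluster varieties, with recognition accuracy and speed that meet the requirements of automatic picking.

6.3.2 Comparative performance analysis of the picking point recognition method

To demonstrate the effectiveness of the proposed method, it was compared with the color-feature-based method of reference [16] and the purely deep-neural-network-based method of reference [28], as shown in Fig. 10 and Fig. 11, where the green points are picking points. Against complex, similarly colored backgrounds, as in Fig. 10a and Fig. 10b, excessive noise leaves the method of reference [16] with insufficient recognition accuracy. Table 6 compares the recognition accuracy of the two methods: 93.83% for the proposed method and 61.90% for the method of reference [16]. The proposed method can recognize the stem picking points of tomato clusters of different shapes against complex backgrounds, with higher accuracy and stability. Because tomato clusters vary greatly in shape, the method of reference [28] can hardly locate the picking point precisely, as in Fig. 11a and Fig. 11b.

Table 5 Picking point recognition accuracy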

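6.4 Picking point location accuracy test

Field picking tests were carried out on 2 ridges of Jinlinglong tomato clusters at a tomato garden of the Guangdong Agricultural Technology Extension Station. The robot travels along the track while the vision system captures images and runs target detection. When a pickable tomato cluster is recognized, the robot stops, the vision system recognizes the target again and quickly locates the picking point, and the picking algorithm guides the arm to pick the cluster and place it in the basket. The robot then moves forward again. The picking process is shown in Fig. 12, and the robot repeats it as it moves along the row. During the tests, 29 pickable targets were detected and 28 clusters were picked. The distance between the stem and the cutting center of the picking gripper at the moment of cutting was used as the evaluation index, as shown in Fig. 12b, to test the picking point location accuracy of the robot. For the 28 successfully picked clusters, the distribution of picking point depth errors is shown in Table 7; the depth error was within ±3 mm.

Table 6 Picking point recognition accuracy of the 2 methods

Table 7 Distribution of depth value errors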

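7 Conclusions

To address the difficulty of recognizing and locating tomato cluster picking points in complex environments, a method based on RGB-D information fusion and target detection was proposed. A YOLOv4 model for tomato cluster and stem target detection, combined with the connectivity between each tomato cluster and its stem, quickly recognizes pickable stems against complex backgrounds. Fusing the depth and color information of the RGB-D image allows picking points to be recognized against similarly colored backgrounds. Combining stem depth and color information with a mean-depth filling algorithm for the target stem yields a precise depth value for the picking point, so that the picking point is located precisely. The proposed algorithm can recognize and locate the picking points of tomato clusters in different poses under different lighting conditions.

Experiments and extensive field tests show that, in complex environments, the method recognizes and locates stem picking points with a recognition success rate of 93.83%, a picking point depth error of ±3 mm, and an average recognition time of 54 ms for a single frame at 1 280×720 pixels, which meets the needs of automatic picking. By changing the training samples of the YOLOv4 model, the method can also recognize and locate the picking points of other tomato cluster varieties. Although developed for tomato clusters, the method is also applicable to recognizing and locating the picking points of other cluster-harvested fruits in complex environments.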

[1] Tang Y, Chen M, Wang C, et al. Recognition and localization methods for vision-based fruit picking robots: A review[J]. Frontiers in Plant Science, 2020, 11: 510.

[2] 方建军. 移动式采摘机器人研究现状与进展[J]. 农业工程学报，2004，20(2)：273-278.

Fang Jianjun. Present situation and development of mobile harvesting robot[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2004, 20(2): 273-278. (in Chinese with English abstract)

[3] 刘继展. 温室采摘机器人技术研究进展分析[J]. 农业机械学报，2017，48(12)：1-18.

Liu Jizhan. Research progress analysis of robotic harvesting technologies in greenhouse[J]. Transactions of the Chinese Society for Agricultural Machinery, 2017, 48(12): 1-18. (in Chinese with English abstract)

[4] Berenstein R, Shahar O B, Shapiro A, et al. Grape clusters and foliage detection algorithms for autonomous selective vineyard sprayer[J]. Intelligent Service Robotics, 2010, 3(4): 233-243.

[5] 冯青春，袁挺，纪超，等. 黄瓜采摘机器人远近景组合闭环定位方法[J]. 农业机械学报，2011，42(2)：154-157.

Feng Qingchun, Yuan Ting, Ji Chao, et al. Feedback locating control based on close scene for cucumber harvesting robot[J]. Transactions of the Chinese Society for Agricultural Machinery, 2011, 42(2): 154-157. (in Chinese with English abstract)

[6] 王粮局，张立博，段运红，等. 基于视觉伺服的草莓采摘机器人果实定位方法[J]. 农业工程学报，2015，31(22)：25-31.

Wang Liangju, Zhang Libo, Duan Yunhong, et al. Fruit localization for strawberry harvesting robot based on visual servoing[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2015, 31(22): 25-31. (in Chinese with English abstract)

[7] Chen X, Chaudhary K, Tanaka Y, et al. Reasoning-based vision recognition for agricultural humanoid robot toward tomato harvesting[C]//2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Hamburg, Germany: IEEE, 2015.

[8] Ling X, Zhao Y, Gong L, et al. Dual-arm cooperation and implementing for robotic harvesting tomato using binocular vision[J]. Robotics and Autonomous Systems, 2019, 114: 134-143.

[9] 熊俊涛，邹湘军，彭红星，等. 扰动柑橘采摘的实时识别与采摘点确定技术[J]. 农业机械学报，2014，45(8)：38-43.

Xiong Juntao, Zou Xiangjun, Peng Hongxing, et al. Real-time identification and picking point localization of disturbance citrus picking[J]. Transactions of the Chinese Society for Agricultural Machinery, 2014, 45(8): 38-43. (in Chinese with English abstract)

[10] Luo L, Tang Y, Lu Q, et al. A vision methodology for harvesting robot to detect cutting points on peduncles of double overlapping grape clusters in a vineyard[J]. Computers in Industry, 2018, 99: 130-139.

[11] Xiong J, Liu Z, Lin R, et al. Green grape detection and picking-point calculation in a night-time natural environment using a charge-coupled device (CCD) vision sensor with artificial illumination[J]. Sensors (Basel), 2018, 18(4): 969.

[12] 罗陆锋，邹湘军，熊俊涛，等. 自然环境下葡萄采摘机器人采摘点的自动定位[J]. 农业工程学报，2015，31(2)：14-21.

Luo Lufeng, Zou Xiangjun, Xiong Juntao, et al. Automatic positioning for picking point of grape picking robot in natural environment[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2015, 31(2): 14-21. (in Chinese with English abstract)

[13] 熊俊涛，邹湘军，陈丽娟，等. 采摘机械手对扰动荔枝的视觉定位[J]. 农业工程学报，2012，28(14)：36-41.

Xiong Juntao, Zou Xiangjun, Chen Lijuan, et al. Visual position of picking manipulator for disturbed litchi[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2012, 28(14): 36-41. (in Chinese with English abstract)

[14] Zhuang J, Hou C, Tang Y, et al. Computer vision-based localisation of picking points for automatic litchi harvesting applications towards natural scenarios[J]. Biosystems Engineering, 2019, 187: 1-20.

[15] Benavides M, Cantón-Garbín M, Sánchez-Molina J A, et al. Automatic tomato and peduncle location system based on computer vision for use in robotized harvesting[J]. Applied Sciences, 2020, 10(17): 5887.

[16] 梁喜凤，金超杞，倪梅娣，等. 番茄果实串采摘点位置信息获取与试验[J]. 农业工程学报，2018，34(16)：163-169.

Liang Xifeng, Jin Chaoqi, Ni Meidi, et al. Acquisition and experiment on location information of picking point of tomato fruit clusters[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(16): 163-169. (in Chinese with English abstract)

[17] Yoshida T, Fukao T, Hasegawa T. Cutting point detection using a robot with point clouds for tomato harvesting[J]. Journal of Robotics and Mechatronics, 2020, 32(2): 437-444.

[18] Sa I, Lehnert C, English A, et al. Peduncle detection of sweet pepper for autonomous crop harvesting - combined colour and 3D information[J]. IEEE Robotics and Automation Letters, 2017, 2(2): 765-772.

[19] Nguyen T, Vandevoorde K, Wouters N, et al. Detection of red and bicoloured apples on tree with an RGB-D camera[J]. Biosystems Engineering, 2016, 146: 33-44.

[20] 麦春艳，郑立华，孙红，等. 基于RGB-D相机的果树三维重构与果实识别定位[J]. 农业机械学报，2015，46(S1)：35-40.

Mai Chunyan, Zheng Lihua, Sun Hong, et al. Research on 3D reconstruction of fruit tree and fruit recognition and location method based on RGB-D camera[J]. Transactions of the Chinese Society for Agricultural Machinery, 2015, 46(S1): 35-40. (in Chinese with English abstract)

[21] Ge Y, Xiong Y, From P J. Symmetry-based 3D shape completion for fruit localisation for harvesting robots[J]. Biosystems Engineering, 2020, 197: 188-202.

[22] Yoshida T, Fukao T, Hasegawa T. A tomato recognition method for harvesting with robots using point clouds[C]//2019 IEEE/SICE International Symposium on System Integration (SII). Paris, France: IEEE, 2019.

[23] Yin W, Wen H, Ning Z, et al. Fruit detection and pose estimation for grape cluster-harvesting robot using binocular imagery based on deep neural networks[J]. Frontiers in Robotics and AI, 2021, 8: 626989.

[24] Zheng C, Chen P, Pang J, et al. A mango picking vision algorithm on instance segmentation and key point detection from RGB images in an open orchard[J]. Biosystems Engineering, 2021, 206: 32-54.

[25] 宁政通，罗陆锋，廖嘉欣，等. 基于深度学习的葡萄果梗识别与最优采摘定位[J]. 农业工程学报，2021，37(9)：222-229.

Ning Zhengtong, Luo Lufeng, Liao Jiaxin, et al. Recognition and the optimal picking point location of grape stems based on deep learning[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(9): 222-229. (in Chinese with English abstract)

[26] 陈燕，王佳盛，曾泽钦，等. 大视场下荔枝采摘机器人的视觉预定位方法[J]. 农业工程学报，2019，35(23)：48-54.

Chen Yan, Wang Jiasheng, Zeng Zeqin, et al. Vision pre-positioning method for litchi picking robot under large field of view[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(23): 48-54. (in Chinese with English abstract)

[27] Yu Y, Zhang K, Liu H, et al. Real-time visual localization of the picking points for a ridge-planting strawberry harvesting robot[J]. IEEE Access, 2020, 8: 116556-116568.

[28] Arad B, Balendonck J, Barth R, et al. Development of a sweet pepper harvesting robot[J]. Journal of Field Robotics, 2020, 37(6): 1027-1039.

[29] Bochkovskiy A, Wang C, Liao H M. YOLOv4: Optimal speed and accuracy of object detection[DB/OL]. arXiv, 2020. [2020-04-23]. https://arxiv.org/abs/2004.10934.

[30] Zhang T Y, Suen C Y. A fast parallel algorithm for thinning digital patterns[J]. Communications of the ACM, 1984, 27(3): 236-239.

[31] Zhao Y, Gong L, Huang Y, et al. A review of key techniques of vision-based control for harvesting robot[J]. Computers and Electronics in Agriculture, 2016, 127: 311-323.

[32] Halmetschlager-Funek G, Suchi M, Kampel M, et al. An empirical evaluation of ten depth cameras: bias, precision, lateral noise, different lighting conditions and materials, and multiple sensor setups in indoor environments[J]. IEEE Robotics & Automation Magazine, 2019, 26(1): 67-77.

[33] Zhang Z. A flexible new technique for camera calibration[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(11): 1330-1334.

[34] 张勤，刘丰溥，蒋先平，等. 番茄串收机械臂运动规划方法与试验[J]. 农业工程学报，2021，37(9)：149-156.

Zhang Qin, Liu Fengpu, Jiang Xianping, et al. Motion planning method and experiments of tomato bunch harvesting manipulator[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(9): 149-156. (in Chinese with English abstract)

Method for recognizing and locating tomato cluster picking points based on RGB-D information fusion and target detection

Zhang Qin1, Chen Jianmin1, Li Bin2※, Xu Can3

(1. School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510641, China; 2. School of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, China; 3. Guangdong Institute of Modern Agricultural Equipment, Guangzhou 510630, China)

Recognizing and locating picking points is a key technology for intelligent fruit-picking robots in mechanized modern agriculture, and an important guarantee of efficient, timely, and lossless picking during harvesting. A tomato cluster can contain both mature and immature fruits, and clusters vary widely in shape. The color of the fruit stem is similar to that of branches and leaves, and fruit stems and petioles are similar in shape. In addition, an economical RGB-D depth camera using active stereo technology captures depth values with large errors, or no depth values at all, for such thin targets. It is therefore very difficult for picking robots to identify the picking points of tomato clusters in a complex planting environment. In this study, a recognition and location algorithm was proposed for the picking points of tomato clusters using RGB-D information fusion and target detection. Firstly, the regions of interest (ROIs) of tomato clusters and stems were obtained by YOLOv4 target detection in order to locate picking targets efficiently. The ROIs of pickable stems connected to ripe tomato clusters were then screened according to the neighbor relationship between tomato clusters and stems. Secondly, a comprehensive segmentation method using RGB-D information fusion was used to accurately recognize the picking points of stems against the similarly colored background of the ROI. Specifically, the tomato clusters in the nearest row were regarded as foreground in the RGB-D image and the rest as background (i.e., noise), because the robot picks only from the nearest row. Depth information segmentation and morphological operations were then combined to remove the noise in the pickable stem ROI of the RGB image. Subsequently, the pickable stem edges were extracted from the stem ROI using K-means clustering together with morphological operations and RGB color features. After the stem skeleton was extracted by the thinning operation, the center point of the skeleton along the central axis line was set as the picking point in the image coordinate system. Thirdly, the RGB image and depth map of the pickable stem ROI were fused to locate the picking point. Specifically, the average depth of the pickable stem was calculated from the depth information of the whole stem after noise was removed by mean filtering. An accurate depth value for the picking point was then obtained by comparing this average with the original depth value. Finally, the picking point was converted from the image coordinate system to the robot coordinate system, and the harvesting robot carried out the picking action according to the coordinates of the picking point. Field tests showed that the average runtime for one image at a resolution of 1 280×720 was 54 ms, the recognition rate of picking points was 93.83%, and the depth error of the picking point was ±3 mm. The proposed algorithm therefore meets the practical requirements of harvesting robots in field operation.

Keywords: image recognition; object recognition; extraction; tomato cluster; RGB-D image; information fusion; target detection; picking point

张勤，陈建敏，李彬，等. 基于RGB-D信息融合和目标检测的番茄串采摘点识别定位方法[J]. 农业工程学报，2021，37(18)：143-152.doi：10.11975/j.issn.1002-6819.2021.18.017 http://www.tcsae.org

Zhang Qin, Chen Jianmin, Li Bin, et al. Method for recognizing and locating tomato cluster picking points based on RGB-D information fusion and target detection[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(18): 143-152. (in Chinese with English abstract) doi：10.11975/j.issn.1002-6819.2021.18.017 http://www.tcsae.org

2021-03-12

2021-05-26

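Supported by the Key-Area Research and Development Program of Guangdong Province (2019B020222002); the 2019 Guangdong Provincial Special Fund for the Rural Revitalization Strategy (Yue Cai Nong [2019] No. 73); and the Guangdong Innovation Team Construction Project for Common Key Technologies of the Modern Agricultural Industry (2019KJ129)

Zhang Qin, Ph.D., professor, doctoral supervisor; research interests: robotics and its applications. Email: zhangqin@scut.edu.cn

Li Bin, Ph.D., associate professor; research interests: image processing and pattern recognition, machine learning, artificial intelligence. Email: binlee@scut.edu.cn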

10.11975/j.issn.1002-6819.2021.18.017

TP391.4

1002-6819(2021)-18-0143-10