Vision pre-positioning method for litchi picking robot under large field of view

2019-02-20 Wang Jiasheng, Zeng Zeqin, Zou Xiangjun, Chen Mingyou

Transactions of the Chinese Society of Agricultural Engineering, 2019, Issue 23

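Chen Yan, Wang Jiasheng, Zeng Zeqin, Zou Xiangjun※, Chen Mingyou

(1. College of Engineering, South China Agricultural University, Guangzhou 510642; 2. Key Laboratory of Key Technology on Agricultural Machine and Equipment, Ministry of Education, South China Agricultural University, Guangzhou 510642)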

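When picking litchi, a robot needs the spatial positions of multiple target litchi clusters in order to plan an optimal motion trajectory and improve efficiency. This paper studies a vision pre-positioning method for a litchi picking robot under a large field of view. First, litchi images are captured with a binocular camera. Next, the original YOLOv3 network is modified to obtain the litchi cluster detection network YOLOv3-DenseNet34, and a litchi cluster pairing method based on a same-row order consistency constraint is proposed. Finally, the spatial coordinates of the litchi clusters are computed by triangulation based on binocular stereo vision. The experimental results show that YOLOv3-DenseNet34 improves both the detection accuracy and the detection speed for litchi clusters: the mean average precision (mAP) reaches 0.943 and the average detection speed reaches 22.11 frames/s. At a detection distance of 3 m, the binocular stereo vision pre-positioning method has a maximum absolute error of 36.602 mm, a mean absolute error of 23.007 mm, and a mean relative error of 0.836%. This meets the vision pre-positioning requirements of a picking robot under a large field of view and can serve as a reference for vision pre-positioning when picking other fruits and vegetables under a large field of view.

Keywords: robots; image processing; object detection; litchi picking; large field of view; convolutional neural network; stereo vision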

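0 Introduction

Developing litchi picking robots to automate litchi harvesting and make it intelligent is an important way to address the low level of automation in litchi picking in China. The vision system is a key component of a litchi picking robot[1], and machine-vision-based positioning has been widely applied in agriculture in recent years[2-4]. The visual positioning algorithm is the core of the vision system, and its performance directly affects the picking efficiency and quality of the robot. Research on visual positioning for litchi picking is therefore of considerable importance.

The team of Professor Zou Xiangjun at South China Agricultural University has carried out extensive research on vision-based litchi picking robots[5-8] and has proposed methods for segmenting litchi in natural environments[9-12]. Many other researchers in China have also studied recognition and positioning for picking various fruits[13-16]. However, these studies all address a small field of view containing only one or two litchi clusters. Drawing on experience with fruit and vegetable picking abroad[17-18], pre-positioning the overall fruit distribution on a litchi tree before the robot reaches its working range can guide the robot to the picking position, where precise picking-point positioning is then performed, thereby improving picking efficiency. A large field of view means that the camera covers a wide area; under this condition, multiple litchi clusters appear in the camera view, which makes positioning the clusters more difficult. It is therefore necessary to study vision pre-positioning for a litchi picking robot under a large field of view.

In recent years, with the development of deep learning and convolutional neural networks in particular, many researchers have used convolutional neural networks for classification, segmentation, recognition, and detection[19-32]. For example, reference [20] optimized the network structure on the basis of VGGNet to improve feature extraction for the main organs of tomato plants, and generated detection regions with Selective Search to detect the main organs of tomatoes of different varieties and maturity levels. References [28-29] each used a YOLO algorithm to recognize and locate picking targets and obtained good results. Deep learning methods are therefore expected to help with the pre-positioning of litchi clusters.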

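1 Materials and methods

1.1 Experimental equipment

The experimental equipment consists of hardware and software. The hardware mainly includes: a binocular stereo vision system built from 2 GigE industrial cameras (Microvision (维视) MV-EM200C, resolution 1600×1200 pixels, frame rate 60 frames/s, lens focal length 16 mm); a Bosch GLM50 laser range finder with an effective measuring range of 0.05 to 50 m and a measuring accuracy of ±1.5 mm; a Microvision high-precision dot calibration board with 9×11 dots and a dot-center spacing of (30±0.01) mm; and a laptop computer with an i7-7700HQ processor, 16 GB of 2 400 MHz memory, and a GTX1060 6 GB graphics card.

The software is built mainly on the OpenCV vision library and the DarkNet deep learning framework.

1.2 Image and data acquisition

The binocular stereo vision system must be calibrated before images are captured. According to the triangulation principle, a larger baseline gives higher measurement accuracy but a smaller common field of view for the 2 cameras. To keep a large common field of view at relatively high accuracy, a baseline of 110 mm was chosen after repeated adjustment. To ensure image accuracy, the calibration was carried out within the large-field-of-view range, that is, with the cameras 2.5 to 3 m from the target fruit, using the dot calibration board.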
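
The baseline trade-off can be made explicit with the standard parallel-axis stereo model. The symbols below are notation assumed here for illustration, not taken from the paper: $Z$ is the depth, $f$ the focal length in pixels, $B$ the baseline, and $d$ the disparity in pixels.

```latex
Z = \frac{fB}{d}
\quad\Longrightarrow\quad
\left|\Delta Z\right| \approx \left|\frac{\partial Z}{\partial d}\right| \left|\Delta d\right|
= \frac{Z^{2}}{fB}\,\left|\Delta d\right|
```

For a fixed disparity error $\Delta d$, the depth error grows with the square of the distance and shrinks as the baseline grows, which is why a wider baseline measures more accurately at 2.5 to 3 m while leaving a smaller common field of view.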

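The test images were taken in June and July 2018 in Zengcheng District and Conghua District, Guangzhou. Litchi images under a large field of view were captured in the field, and the distance to the litchi clusters was measured with the laser range finder for comparison with the results of the proposed algorithm. A total of 250 binocular image pairs were collected. Because such a small sample set is prone to overfitting, the original images and the epipolar-rectified images were augmented by small-range random cropping and scaling, giving a final data set of 4 000 images. The data set for the object detection network was then annotated with the open-source tool LabelImg.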
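
The small-range random cropping and scaling could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `random_crop_and_scale`, the 10% crop limit, the 0.9 to 1.1 scale range, and the nearest-neighbour resampling are hypothetical choices, since the paper does not give the augmentation parameters.

```python
import numpy as np

def random_crop_and_scale(image, rng, max_crop=0.1, scale_range=(0.9, 1.1)):
    """Crop a small random margin, then rescale with nearest-neighbour sampling."""
    h, w = image.shape[:2]
    # Total number of rows/columns to remove (at most max_crop of each dimension).
    dy = int(rng.integers(0, int(h * max_crop) + 1))
    dx = int(rng.integers(0, int(w * max_crop) + 1))
    # Split the removed margin randomly between the two opposite sides.
    top = int(rng.integers(0, dy + 1))
    left = int(rng.integers(0, dx + 1))
    cropped = image[top:h - (dy - top), left:w - (dx - left)]
    # Rescale by a random factor using index look-up (nearest neighbour).
    s = float(rng.uniform(*scale_range))
    ch, cw = cropped.shape[:2]
    nh = max(1, int(round(ch * s)))
    nw = max(1, int(round(cw * s)))
    rows = np.minimum((np.arange(nh) / s).astype(int), ch - 1)
    cols = np.minimum((np.arange(nw) / s).astype(int), cw - 1)
    return cropped[rows][:, cols]

augmented = random_crop_and_scale(np.zeros((120, 160, 3), np.uint8),
                                  np.random.default_rng(0))
```

In a detection data set the bounding-box annotations would have to be shifted and scaled by the same amounts; that bookkeeping is omitted here.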

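1.3 Litchi cluster pre-positioning method

Under a large field of view the image background is complex, and dense stereo matching over the whole image is inefficient and gives poor results. In addition, as the blue and red boxes in Fig. 1 show, some litchi clusters do not appear completely in the common field of view, which degrades template matching of the cluster images and makes accurate positioning difficult. This paper therefore first runs object detection on the left and right images, then pairs the detected litchi clusters with an algorithm based on a same-row order consistency constraint, and finally computes the 3D spatial coordinates of each cluster from the disparity of its center according to the triangulation principle.

Note: Yellow boxes mark clusters whose image is complete in the common field of view; blue boxes mark clusters whose image is partly missing in the common field of view; red boxes mark clusters whose image is entirely missing from the common field of view.

1.3.1 Litchi cluster object detection

Drawing on the YOLOv3[30] object detection network and the DenseNet[31] classification network, and exploiting the fact that litchi cluster detection involves a single scene (orchards only) and a single target class, the network structure was optimized: densely connected convolutional layers with a depth of 34 layers were designed (called the Dense Module below), and the litchi cluster detection network YOLOv3-DenseNet34 was built on the Dense Module.

A convolution layer (Conv), a batch normalization layer (BN), and an activation layer (leaky ReLU) form one basic component layer (DarkNet convolution, batch normalization, leaky ReLU; called DBL below) (bottom left of Fig. 2), where DBL(1×1) denotes a convolution kernel size of 1×1. Several DBL layers form a DBL block (bottom right of Fig. 2), and several DBL blocks form a Dense Module; the connections between blocks are shown in Fig. 2.

Fig. 2 Structure of the Dense Module

The anchor box sizes of YOLOv3-DenseNet34 were obtained by K-means clustering of the widths and heights of the annotated litchi in all images of the sample set. Based on the scale distribution of the samples, 6 clusters were used. The resulting anchor boxes are (20, 20), (33, 27), (26, 39), (48, 49), (32, 56), and (57, 95).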
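
Anchor clustering of this kind can be sketched as below. The paper only states that K-means with 6 clusters was used; the IoU-based similarity (common practice for YOLO anchors), the random initialization, and the names `iou_wh` and `kmeans_anchors` are assumptions made for this illustration.

```python
import numpy as np

def iou_wh(boxes, centers):
    """IoU between (w, h) pairs, treating all boxes as sharing one corner."""
    inter = (np.minimum(boxes[:, None, 0], centers[None, :, 0]) *
             np.minimum(boxes[:, None, 1], centers[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_c = centers[:, 0] * centers[:, 1]
    return inter / (area_b[:, None] + area_c[None, :] - inter)

def kmeans_anchors(boxes, k=6, iters=100, seed=0):
    """Cluster (w, h) pairs into k anchors, returned sorted by area."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    centers = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # Assign every box to the center it overlaps most (highest IoU).
        assign = np.argmax(iou_wh(boxes, centers), axis=1)
        # Move each center to the mean of its boxes; keep empty clusters as-is.
        new = np.array([boxes[assign == j].mean(axis=0) if np.any(assign == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers[np.argsort(centers.prod(axis=1))]
```

Calling `kmeans_anchors(wh_pairs, k=6)` on the labelled box sizes (a hypothetical `wh_pairs` array of shape `(N, 2)`) returns six (width, height) anchors ordered from smallest to largest area.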

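To avoid losing original data, YOLOv3-DenseNet34 downsamples with stride-2 convolutions instead of max pooling. The number of downsampling steps is related to the convolution receptive field and the anchor box side length as follows:

$$F \cdot 2^{n} \geq L_{\max}$$

where $F$ is the receptive field size of the convolution; $n$ is the number of downsampling steps; and $L_{\max}$ is the largest anchor box side length.

From the clustering results above, the largest anchor box side length is 95. With the receptive field of a 3×3 convolution, $3 \times 2^{5} = 96 \geq 95$, so YOLOv3-DenseNet34 uses 5 downsampling steps.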
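
The choice of 5 downsampling steps can be checked numerically. The sketch assumes the relation has the form "receptive field size × 2^n ≥ largest anchor side", which is consistent with the values given in the text (3, 95, and 5); the function name `min_downsamples` is hypothetical.

```python
def min_downsamples(max_anchor_side, kernel_size=3):
    """Smallest n such that kernel_size * 2**n >= max_anchor_side (assumed relation)."""
    n = 0
    while kernel_size * 2 ** n < max_anchor_side:
        n += 1
    return n

print(min_downsamples(95))  # 3 * 2**5 = 96 >= 95, so this prints 5
```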

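The structure of the YOLOv3-DenseNet34 object detection network designed in this paper is shown in Fig. 3, where DBL (stride = 2) is the convolution layer used for downsampling. The network extracts multi-scale features with a 34-layer convolutional backbone containing 4 Dense Modules and makes predictions on 3 feature maps of different scales, labelled 1, 2, and 3 in Fig. 3, which are downsampled 5, 4, and 3 times, respectively. Each scale predicts 2 outputs. Each output contains 6 values: the offsets of the target position and size in the different directions, the confidence, and the one-hot target class. The depth of every prediction output is therefore 12.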
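
The output depth of 12 follows from simple arithmetic, sketched below. The input size of 416 pixels is an assumption for illustration (the paper does not state the network input size), and `head_shapes` is a hypothetical helper.

```python
def head_shapes(input_size, num_anchors=6, num_scales=3, num_classes=1,
                downsamples=(5, 4, 3)):
    """Return (grid, grid, depth) for each prediction scale."""
    per_scale = num_anchors // num_scales        # 6 anchors over 3 scales -> 2 each
    depth = per_scale * (4 + 1 + num_classes)    # 4 offsets + confidence + class = 6
    return [(input_size // 2 ** n, input_size // 2 ** n, depth) for n in downsamples]

print(head_shapes(416))  # [(13, 13, 12), (26, 26, 12), (52, 52, 12)]
```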

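1.3.2 Litchi cluster pre-positioning based on binocular stereo vision

After monocular and binocular calibration, corresponding points in the left and right images must be stereo matched, the disparity of the matched points computed, and their 3D coordinates calculated by triangulation. Dense stereo matching over a whole large-field-of-view litchi image is computationally expensive and prone to mismatches, and even a completed global match does not give the position of each litchi cluster.

Once the litchi clusters have been detected, direct template matching can be used: each detection in the left image serves as a template, template matching is run on the right image, and the point with the highest matching score is taken as the match, giving a sparse stereo match. However, direct template matching searches the whole right image for every cluster in the left image, which is computationally expensive and prone to mismatches, as shown in Fig. 4. In Fig. 4, regions with the same number in the left and right images are those that direct template matching considers to be the same cluster. Clusters 5, 6, and 8 are clearly mismatched.

To solve these problems, a sparse stereo matching algorithm based on a same-row order consistency constraint is proposed on top of the litchi cluster detection. The pairing is carried out after epipolar rectification. Each detection in the left image is used as a template, and the row constraint restricts the search for its match to the same rows, which reduces the search range. In addition, in a parallel-optical-axis binocular stereo model, the x coordinate of a spatial point in the right image is always smaller than in the left image. Therefore, if the bottom-right corner of the template has abscissa xl in the left image, its search range xr in the right image can be limited to 0~xl, which reduces the search range further. The same-row order consistency constraint thus narrows the search, speeds up matching, and reduces mismatches. The resulting matching range is shown in Fig. 5.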
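
The constrained search can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `match_in_row_band`, the use of normalized cross-correlation as the similarity score, and the grayscale NumPy inputs are all assumptions.

```python
import numpy as np

def match_in_row_band(left, right, box):
    """Match a left-image box in the right image along the same rows.

    box = (x0, y0, x1, y1) in the rectified left image (x1 and y1 exclusive).
    Returns (best top-left x in the right image, best correlation score).
    """
    x0, y0, x1, y1 = box
    tmpl = left[y0:y1, x0:x1].astype(float)
    tmpl = tmpl - tmpl.mean()
    w = x1 - x0
    best_x, best_score = -1, -np.inf
    # Rows are fixed by rectification. Columns are limited so that the window's
    # right edge never passes x1: the match cannot lie further right than the
    # template does in the left image.
    for x in range(0, x0 + 1):
        win = right[y0:y1, x:x + w].astype(float)
        win = win - win.mean()
        denom = np.sqrt((tmpl ** 2).sum() * (win ** 2).sum())
        score = (tmpl * win).sum() / denom if denom > 0 else -1.0
        if score > best_score:
            best_x, best_score = x, score
    return best_x, best_score
```

Compared with searching the whole right image, the loop visits only `x0 + 1` window positions in a single row band, which is where the speed-up and the reduction in mismatches come from.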

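Fig. 3 Structure of the YOLOv3-DenseNet34 network

Note: Yellow boxes show the litchi clusters detected in the left image; purple boxes show the litchi clusters detected in the right image.

Note: xl is the abscissa of the target in the left image; xr is the abscissa of the target in the right image.

To remove mismatches from same-row order consistency matching, the overlap between each candidate matching region and the object detection results in the right image is computed, and for each candidate region the detection with the largest overlap is kept as its pair. Pairs with no overlap or very low overlap (IoU<0.2) are then discarded.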
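
The pairing and rejection rule can be written compactly. The box format `(x0, y0, x1, y1)` and the names `iou` and `pair_candidates` are assumptions for this sketch; the 0.2 threshold is the one given in the text.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def pair_candidates(candidates, detections, min_iou=0.2):
    """Map each candidate region to the right-image detection it overlaps most."""
    pairs = {}
    for i, cand in enumerate(candidates):
        scores = [iou(cand, det) for det in detections]
        if not scores:
            continue
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= min_iou:  # discard pairs with IoU below the threshold
            pairs[i] = j
    return pairs
```

For example, a candidate at `(0, 0, 10, 10)` pairs with a detection at `(1, 1, 11, 11)` (IoU about 0.68), while a candidate that overlaps no detection is dropped.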

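Finally, the matching results are refined. The overlapping region of each pair in the right image (white box in Fig. 6b) is taken as the template and matched in the left image with the same-row order consistency method. The constraint range changes slightly here: if the top-left corner of the overlapping region has abscissa xr in the right image, the sliding-window search is limited to (xr, W) in the same rows as the overlapping region, where W is the image width. The refined result is shown by the white boxes in Fig. 6.

Note: Yellow boxes show the object detection results; purple boxes show where the left-image detections are matched in the right image; white boxes show the overlapping regions of the matching results in the left and right images.

1.3.3 Sub-pixel disparity calculation

After pairing, a matching point must be chosen for computing the disparity. When the paired boxes have the same size, the disparity of their center points equals the disparity of their top-left corners. Because this disparity is only pixel-accurate, a method for computing sub-pixel disparity was designed. First, the similarity and disparity of the paired boxes are computed. Next, the matching similarity at the neighboring disparities (at pixel accuracy) is computed; together with the original match this gives 3 points in the disparity-similarity plane (points p1, p2, and p3 in Fig. 7), which uniquely determine a quadratic curve (the curve in Fig. 7). Finally, the vertex of this curve is found (point t in Fig. 7), and its abscissa is the sub-pixel disparity. With the disparity known, the 3D spatial coordinates of the matching point can be computed.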
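
Fitting a quadratic through three equally spaced points has a closed-form vertex, sketched below. The function name `subpixel_disparity` and its argument names are hypothetical; the sketch assumes the two neighbors are the disparities one pixel below and one pixel above the best integer disparity.

```python
def subpixel_disparity(d, s_prev, s_best, s_next):
    """Vertex abscissa of the parabola through (d-1, s_prev), (d, s_best), (d+1, s_next)."""
    curvature = s_prev - 2.0 * s_best + s_next
    if curvature == 0:  # the three points are collinear, so there is no vertex
        return float(d)
    return d + 0.5 * (s_prev - s_next) / curvature
```

As a check, similarities sampled from the parabola s(x) = -(x - 10.3)^2 at x = 9, 10, 11 are -1.69, -0.09, and -0.49, and `subpixel_disparity(10, -1.69, -0.09, -0.49)` recovers the vertex at 10.3 up to floating-point rounding.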

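Note: p1, p2, and p3 are the 3 points in the disparity-similarity plane determined by the original match and the similarities; t is the vertex of the quadratic curve.

1.3.4 Pre-positioning error calculation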

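The coordinates of a matching point in the left camera coordinate system cannot be measured directly, so the errors of the 3 coordinate values cannot be computed directly. The positioning error is therefore measured by the distance error of the spatial point, computed as follows:

$$\Delta d = \left| d_{v} - d_{t} \right|, \qquad d_{v} = \sqrt{x^{2} + y^{2} + z^{2}}$$

where $\Delta d$ is the measurement error, mm; $d_{v}$ is the distance measured by vision, mm; $d_{t}$ is the distance measured by laser, mm; and $x$, $y$, $z$ are the coordinates of the laser point measured by vision, mm.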
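
The distance error can be computed from a disparity as in the sketch below, which uses the standard parallel-axis triangulation formulas. The parameter names (`f` as focal length in pixels, `baseline` in mm, principal point `cx`, `cy`) and the function names are assumptions for illustration; the error is taken as an absolute difference.

```python
import math

def triangulate(xl, yl, disparity, f, baseline, cx, cy):
    """Parallel-axis stereo: left pixel (xl, yl) plus disparity -> (X, Y, Z) in mm."""
    z = f * baseline / disparity
    return (xl - cx) * z / f, (yl - cy) * z / f, z

def distance_error(point, laser_distance):
    """Absolute difference between the vision distance and the laser distance, mm."""
    vision_distance = math.sqrt(sum(c * c for c in point))
    return abs(vision_distance - laser_distance)
```

With the hypothetical values `f = 3000` pixels and `baseline = 110` mm, a disparity of 100 pixels at the principal point gives Z = 3 300 mm; against a laser reading of 3 290 mm the error is 10 mm.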

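2 Experimental results and analysis

The test data were acquired at the time and place and with the equipment described in sections 1.1 and 1.2. The binocular stereo vision system was calibrated first. The tripod head was then adjusted so that the laser spot fell on a litchi cluster and locked; once the laser range finder reading was stable, the laser-measured distance $d_{t}$ was recorded while the 2 cameras captured litchi images simultaneously. This process was repeated until 30 sets of data and 30 pairs of litchi images had been recorded. In each of the 30 image pairs, a fixed-size region centered on the laser spot was taken as the object detection result and matched with the same-row order consistency method; the disparity and the 3D coordinates of the matching point were then computed to obtain the distance from the litchi fruit to the camera. Finally, the error between this vision-measured distance and $d_{t}$ was computed.

2.1 Performance of the litchi cluster detection network

DarkNet[29], the deep learning framework in which YOLOv3 is implemented, was used to build and train the networks. The training parameters of DarkNet53 and DenseNet34 are listed in Table 1.

Table 1 Network training parameters

Following previous studies[29-31], the Loss value is used to represent the training loss and to assess the correctness and convergence of the network. In this paper, the Loss values of the first 1 000 iterations are very large and not meaningful, so the curves are recorded from iteration 1 000 onward, as shown in Fig. 8.

Fig. 8 shows that both networks fit rapidly in the first 2 000 iterations and then level off. The Loss of YOLOv3-DenseNet34 falls more slowly than that of the original network, but both eventually converge, which indicates that the network structure designed in this paper is sound.

Fig. 8 Loss curves during training of the litchi cluster detection networks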

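Mean average precision[33-34] (mAP) is used to measure litchi cluster detection accuracy. It reflects the recognition ability of an object detection network well and is currently the most widely used metric in object detection. Frame rate (frames per second, FPS) is used to express the detection speed of the model. mAP is calculated as follows:

$$P = \frac{tp}{tp + fp}, \qquad A = \frac{1}{N}\sum_{i=1}^{N} P_{i}, \qquad \mathrm{mAP} = \frac{1}{C}\sum_{j=1}^{C} A_{j}$$

where $P$ is the precision; $tp$ is the number of positive examples correctly classified as positive; $fp$ is the number of negative examples wrongly classified as positive; $A$ is the average precision; $N$ is the total number of recognized images; mAP is the mean average precision; and $C$ is the total number of recognized classes.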
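
A direct reading of these definitions is sketched below: precision is averaged over the images of each class and then over the classes. The per-image averaging is inferred from the variable definitions in the text; common detection benchmarks instead compute average precision as the area under the precision-recall curve, so this sketch should not be taken as the authors' evaluation code. The function names and the input layout are hypothetical.

```python
def precision(tp, fp):
    """P = tp / (tp + fp); taken as 0 when nothing was predicted."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

def mean_average_precision(counts_per_class):
    """counts_per_class maps a class name to a list of (tp, fp), one per image."""
    aps = []
    for per_image in counts_per_class.values():
        # Average precision for one class: mean precision over its N images.
        aps.append(sum(precision(tp, fp) for tp, fp in per_image) / len(per_image))
    # Mean over the C classes.
    return sum(aps) / len(aps)
```

With a single class, as in litchi cluster detection, `mean_average_precision({"litchi": [(9, 1), (8, 2)]})` averages the per-image precisions 0.9 and 0.8.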

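The mAP, average detection speed, and model size obtained in the tests are summarized in Table 2.

Table 2 Performance comparison of litchi cluster detection networks

Table 2 shows that the detection speed of YOLOv3-DenseNet34 is about 0.6 times higher than that of the original YOLOv3, reaching 22.11 frames/s, while its mAP is 5.6% higher, reaching 0.943, and its model size is only 9.3 MB, 1/26 that of the original network. Compared with the original YOLOv3 model, the improved litchi cluster detection network YOLOv3-DenseNet34 is thus better in detection speed, detection accuracy, and model size.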

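2.2 Accuracy of binocular stereo vision litchi pre-positioning

The laser-measured values, vision-measured values, and measurement errors of litchi cluster pre-positioning are listed in Table 3. The binocular stereo vision pre-positioning has a maximum absolute error of 36.602 mm, a mean absolute error of 23.007 mm, a standard deviation of 7.434 mm, and a mean relative error of 0.836%, which shows that the proposed method is accurate enough to meet the pre-positioning requirements.

Table 3 Vision-measured values and errors of litchi pre-positioning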

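3 Conclusions

This paper studied a vision pre-positioning method for a litchi picking robot under a large field of view. The original YOLOv3 was modified to obtain the litchi cluster detection network YOLOv3-DenseNet34; a litchi cluster pairing method based on a same-row order consistency constraint was proposed; and the spatial coordinates of the litchi clusters were computed by triangulation based on binocular stereo vision. The experimental results show that YOLOv3-DenseNet34 improves both the detection accuracy and the detection speed for litchi clusters: the mAP reaches 0.943 and the average detection speed reaches 22.11 frames/s. At a detection distance of 3 m, the binocular stereo vision pre-positioning method has a maximum absolute error of 36.602 mm, a mean absolute error of 23.007 mm, and a mean relative error of 0.836%. The method meets the accuracy and speed requirements of vision pre-positioning for picking under a large field of view and can serve as a reference for vision pre-positioning when picking other fruits and vegetables under a large field of view.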

[1] Cheng Xiangyun, Song Xin. A review of research on vision system of fruit and vegetable picking robot[J]. Journal of Zhejiang Agricultural Sciences, 2019, 60(3): 490-493. (in Chinese with English abstract)

[2] Luo Lufeng, Zou Xiangjun, Cheng Tangcan, et al. Design of virtual test system based on hardware-in-loop for picking robot vision localization and behavior control[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(4): 39-46. (in Chinese with English abstract)

[3] Xiong Juntao, He Zhiliang, Tang Linyue, et al. Visual localization of disturbed grape picking point in non-structural environment[J]. Transactions of the Chinese Society for Agricultural Machinery, 2017, 48(4): 29-33, 81. (in Chinese with English abstract)

[4] Zhu Rongjie, Zhu Yinghui, Wang Ling, et al. Cotton positioning technique based on binocular vision with implementation of scale invariant feature transform algorithm[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(6): 182-188. (in Chinese with English abstract)

[5] Ye Min, Zou Xiangjun, Luo Lufeng, et al. Error analysis of dynamic localization tests based on binocular stereo vision on litchi harvesting manipulator[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(5): 50-56. (in Chinese with English abstract)

[6] Zou X, Ye M, Luo C, et al. Fault-tolerant design of a limited universal fruit-picking end-effector based on vision-positioning error[J]. Applied Engineering in Agriculture, 2016, 32(1): 5-18.

[7] Zou X, Zou H, Lu J. Virtual manipulator-based binocular stereo vision positioning system and errors modelling[J]. Machine Vision and Applications, 2012, 23(1): 43-63.

[8] Chen Yan, Zou Xiangjun, Xu Dongfeng, et al. Mechanism design and kinematics simulation of litchi picking manipulator[J]. Journal of Machine Design, 2010, 27(5): 31-34. (in Chinese with English abstract)

[9] Xiong Juntao, Zou Xiangjun, Chen Lijuan, et al. Recognition of mature litchi in natural environment based on machine vision[J]. Transactions of the Chinese Society for Agricultural Machinery, 2011, 42(9): 162-166. (in Chinese with English abstract)

[10] Guo Aixia, Zou Xiangjun, Zhu Mengsi, et al. Color feature analysis and recognition for litchi fruits and their main fruit bearing branch based on exploratory analysis[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2013, 29(4): 191-198. (in Chinese with English abstract)

[11] Xiong Juntao, Zou Xiangjun, Wang Hongjun, et al. Recognition of ripe litchi in different illumination conditions based on Retinex image enhancement[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2013, 29(12): 170-178. (in Chinese with English abstract)

[12] Peng Hongxing, Zou Xiangjun, Chen Lijuan, et al. Fast recognition of multiple color targets of litchi image in field environment based on double Otsu algorithm[J]. Transactions of the Chinese Society for Agricultural Machinery, 2014, 45(4): 61-68. (in Chinese with English abstract)

[13] Fu Longsheng, Tola Elkamil, Al-Mallahi Ahmad, et al. A novel image processing algorithm to separate linearly clustered kiwifruits[J]. Biosystems Engineering, 2019, 183: 184-195.

[14] Fu Longsheng, Sun Shipeng, Vázquez-Arellano Manuel, et al. Kiwifruit recognition method at night based on fruit calyx image[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(2): 199-204. (in Chinese with English abstract)

[15] Liang Xifeng, Jin Chaoqi, Ni Meidi, et al. Acquisition and experiment on location information of picking point of tomato fruit clusters[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(16): 163-169. (in Chinese with English abstract)

[16] Li Han, Zhang Man, Gao Yu, et al. Green ripe tomato detection method based on machine vision in greenhouse[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(Supp.1): 328-334, 388. (in Chinese with English abstract)

[17] Van Henten E J, Van Tuijl B A J, Hemming J, et al. Field test of an autonomous cucumber picking robot[J]. Biosystems Engineering, 2003, 86(3): 305-313.

[18] Mehta S S, Burks T F. Vision-based control of robotic manipulator for citrus harvesting[J]. Computers and Electronics in Agriculture, 2014, 102: 146-158.

[19] Xue Jinlin, Yan Jia, Fan Bowen. Classification and identification method of multiple kinds of farm obstacles based on convolutional neural network[J]. Transactions of the Chinese Society for Agricultural Machinery, 2018, 49(S1): 35-41. (in Chinese with English abstract)

[20] Zhou Yuncheng, Xu Tongyu, Zheng Wei, et al. Classification and recognition approaches of tomato main organs based on DCNN[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(15): 219-226. (in Chinese with English abstract)

[21] Fu Longsheng, Feng Yali, Elkamil Tola, et al. Image recognition method of multi-cluster kiwifruit in field based on convolutional neural networks[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(2): 205-211. (in Chinese with English abstract)

[22] Chen Fengjun, Wang Chenghan, Gu Mengmeng, et al. Spruce image segmentation algorithm based on fully convolutional networks[J]. Transactions of the Chinese Society for Agricultural Machinery, 2018, 49(12): 188-194. (in Chinese with English abstract)

[23] Han Qiaoling, Zhao Yue, Zhao Yandong, et al. Soil pore segmentation of computed tomography images based on fully convolutional network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(2): 128-133. (in Chinese with English abstract)

[24] Gao Yun, Guo Jiliang, Li Xuan, et al. Instance-level segmentation method for group pig images based on deep learning[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(4): 179-187. (in Chinese with English abstract)

[25] Wang Dandan, He Dongjian. Recognition of apple targets before fruits thinning by robot based on R-FCN deep convolution neural network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(3): 156-163. (in Chinese with English abstract)

[26] Yang Guoguo, Bao Yidan, Liu Ziyi. Localization and recognition of pests in tea plantation based on image saliency analysis and convolutional neural network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(6): 156-162. (in Chinese with English abstract)

[27] Bi Song, Gao Feng, Chen Junwen, et al. Detection method of citrus based on deep convolution neural network[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(5): 181-186. (in Chinese with English abstract)

[28] Zhao Dean, Wu Rendi, Liu Xiaoyang, et al. Apple positioning based on YOLO deep convolutional neural network for picking robot in complex background[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(3): 164-173. (in Chinese with English abstract)

[29] Xue Yueju, Huang Ning, Tu Shuqin, et al. Immature mango detection based on improved YOLOv2[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(7): 173-179. (in Chinese with English abstract)

[30] Redmon J, Farhadi A. YOLOv3: An incremental improvement[R]. arXiv, 2018.

[31] Huang G, Liu Z, Maaten L V D, et al. Densely connected convolutional networks[C]//CVPR. IEEE Computer Society, 2017.

[32] Lin G, Tang Y, Zou X, et al. Fruit detection combined with color, depth, and shape information[J/OL]. Precision Agriculture. https://doi.org/10.1007/s11119-019-09654-w, 2019-06-29.

[33] Liu Ting, Qin Bing, Zhang Yu. Introduction to Information Retrieval Systems[M]. Beijing: China Machine Press, 2008. (in Chinese)

[34] Wu Shengli, McClean Sally. Lecture Notes in Computer Science[M]. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006.

Vision pre-positioning method for litchi picking robot under large field of view

Chen Yan, Wang Jiasheng, Zeng Zeqin, Zou Xiangjun※, Chen Mingyou

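(1. College of Engineering, South China Agricultural University, Guangzhou 510642; 2. Key Laboratory of Key Technology on Agricultural Machine and Equipment, Ministry of Education, South China Agricultural University, Guangzhou 510642)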

The litchi picking robot is an important tool for raising the level of automation in litchi picking. To pick litchi, the robot needs the spatial positions of the litchi clusters. To guide the robot to the picking position and improve picking efficiency, this paper studies a vision pre-positioning method for a litchi picking robot under a large field of view. Firstly, a calibrated binocular stereo vision system composed of two industrial cameras was used to take 250 pairs of litchi cluster images under a large field of view in litchi orchards in Guangzhou. The spatial positions of key litchi clusters were recorded with a laser range finder and compared with the results obtained by the proposed method. To expand the sample size, the original images and the epipolar-rectified images were randomly cropped and scaled within a small range, giving a final data set of 4 000 images. The data set for the object detection network was then created with LabelImg. Secondly, drawing on the YOLOv3 network and the DenseNet classification network, and considering that litchi cluster detection involves a single target and a single scene (orchards only), the network structure was optimized: a Dense Module with a depth of 34 layers was designed, and the litchi cluster detection network YOLOv3-DenseNet34 was built on the Dense Module. Thirdly, because the image background under a large field of view is complex, dense stereo matching of the whole image is inefficient and gives poor results, and some litchi clusters do not appear in the common field of view of both images. A matching method based on a same-row order consistency constraint was therefore proposed, and a method for calculating sub-pixel disparity was designed. By solving the quadratic curve formed by disparity and similarity, the sub-pixel disparity was obtained and used to calculate the spatial positions of the litchi clusters. Comparison with the original YOLOv3 network showed that YOLOv3-DenseNet34 improved the detection accuracy and detection speed for litchi clusters: the mean average precision (mAP) was 0.943, the average detection speed was 22.11 frames/s, and the model size was 9.3 MB, 1/26 that of the original YOLOv3 network. The pre-positioning results of the method were then compared with the laser range finder measurements. At a detection distance of 3 m, the maximum absolute error of pre-positioning was 36.602 mm, the mean absolute error was 23.007 mm, and the mean relative error was 0.836%. The results show that the vision pre-positioning method studied in this paper can basically meet the accuracy and speed requirements of vision pre-positioning under a large field of view, and that it can provide a reference for vision pre-positioning when picking other fruits and vegetables under a large field of view.

Keywords: robots; image processing; object detection; litchi picking; large field of view; convolutional neural network; stereo vision

Chen Yan, Wang Jiasheng, Zeng Zeqin, Zou Xiangjun, Chen Mingyou. Vision pre-positioning method for litchi picking robot under large field of view[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(23): 48－54. (in Chinese with English abstract) doi：10.11975/j.issn.1002-6819.2019.23.006 http://www.tcsae.org

Received: 2019-06-30

Revised: 2019-11-11

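Supported by the National Natural Science Foundation of China (31571568) and the Natural Science Foundation of Guangdong Province (2018A030307067)

Chen Yan, associate professor, research interests: agricultural robots, intelligent agricultural equipment, and intelligent design and manufacturing. Email: cy123@scau.edu.cn

Zou Xiangjun, professor and doctoral supervisor, research interests: agricultural robots and machine vision. Email: xjzou1@163.com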

doi: 10.11975/j.issn.1002-6819.2019.23.006

CLC number: TP391.41

Article ID: 1002-6819(2019)-23-0048-07