Improved YOLO V4 model for face recognition of dairy cows by fusing coordinate information

2021-11-26

Transactions of the Chinese Society of Agricultural Engineering (农业工程学报), 2021, Issue 15

Yang Shuqin1,2,3, Liu Yangqihang1,2,3, Wang Zhen1, Han Yuanyuan1, Wang Yongsheng4, Lan Xianyong5

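(1. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, China; 2. Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Yangling 712100, China; 3. Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling 712100, China; 4. College of Veterinary Medicine, Northwest A&F University, Yangling 712100, China; 5. College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China)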

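To identify individual dairy cows accurately, a cow face recognition model that fuses coordinate information is proposed on the basis of the YOLO V4 object detection network. First, a dataset of facial images of 71 dairy cows was collected and expanded by data augmentation to improve the generalization of the model. Second, a coordinate attention mechanism and a coordinate convolution (CoordConv) module containing coordinate channels were introduced into the feature extraction layers and the detection head of YOLO V4, respectively, to make the model more sensitive to target position and improve recognition accuracy. The experimental results show that the improved YOLO V4 model extracts the facial features of individual cows effectively, with a mean average precision (mAP) of 93.68% and an average frame rate of 18 frames/s. Although its detection speed is lower than that of the anchor-free CenterNet, its mAP is 10.92 percentage points higher. Compared with Faster R-CNN and SSD, it is faster and its mAP is 1.51 and 16.32 percentage points higher, respectively. Compared with the original YOLO V4, its mAP is 0.89 percentage points higher while the detection speed is essentially unchanged. The study provides effective technical support for cow face image recognition in precision dairy farming.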

image recognition; animals; dairy cow face; YOLO V4; attention mechanism; coordinate convolution

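0 Introduction

Precision farming is one of the important directions of modern smart livestock production[1-3]. In precision dairy farming, identifying individual cows is a prerequisite for intelligent, large-scale farming[4-7]: it supplies the basic information needed to formulate individual feeding plans and to analyze milk production efficiency and health status[8], and it is an important link in management tasks such as dairy product traceability, epidemic and disease prevention, and insurance claims[2].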

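Traditional cow identification relies mainly on manual observation methods such as ear tags, branding, neck chains, and tattoos[9]. These methods are time-consuming and laborious, and they readily trigger stress responses that can injure both cows and staff. Radio Frequency Identification (RFID) has been applied to individual identification and can track a cow's complete record from birth to slaughter by its number[10-11], but it still has shortcomings in durability and cost. In addition, some researchers have identified individual cows from physiological features such as muzzle patterns, the iris, and retinal blood vessels[12-14], but these features are inconvenient to acquire in practice, which limits how widely the methods can be adopted[15].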

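With the spread of cameras on farms, a growing body of work uses computer vision to identify individual livestock and recognize their behavior[16-19]. For example, Cai et al.[20] extracted texture features with local binary patterns to build a facial description model for cattle face recognition. With the development of deep learning, object detection networks such as Faster R-CNN[21], SSD[22], the YOLO series[23-25], and CenterNet[26] have further advanced cow image recognition. For example, Zhao et al.[27] extracted trunk features of cows and used a convolutional neural network to identify 30 cows; references [28] and [29] used the texture of the cow's back as the feature and identified individual cows with YOLO V3 and R-CNN models, respectively; Yao et al.[30] built a dataset of more than 10 000 cow images taken under different conditions and compared the performance of several deep learning methods for cattle face detection, but that study only detected faces and did not identify individuals.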

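In summary, earlier methods that recognize cow faces from hand-crafted features place high demands on how the dataset is collected, and their accuracy drops under complex conditions when the facial features of an individual cow change. Therefore, to achieve non-contact, low-cost, and efficient individual identification, this paper builds a facial image dataset of 71 Holstein cows covering multiple poses and, based on the YOLO V4[31] model, fuses coordinate information into the feature maps to increase the model's sensitivity to cow position. The aim is to improve the accuracy and speed of cow face recognition and to provide effective technical support for individual cow identification.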

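1 Materials and methods

1.1 Cow image acquisition

The cow images were taken at Keyuan Cloning Co., Ltd. (科元克隆股份有限公司) in Yangling District, Xianyang City, Shaanxi Province. A Sony FDR-AX100E camcorder was used on 20 January 2019, 14 October 2020, and 17 January 2021 to track and film 71 American Holstein cows in the real farm environment. Each video lasts about 1 min, with a frame rate of 30 frames/s and a resolution of 1 440 × 1 080 pixels. The dataset covers cows at different growth stages (growing, young, dry, and lactating), under different lighting conditions, in different poses, and with different degrees of occlusion. The collected faces fall into two color types, entirely black and black-and-white. Pure black cows are few, and cows with black-and-white faces make up most of the dataset, as shown in Fig. 1.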

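1.2 Dataset construction

Frames were extracted from the videos at 15 frames/s, and blurred, heavily occluded, or poorly lit images were discarded, giving 6 486 cow face images in 71 classes, of which 90% were assigned to the training set and the remaining 10% to the test set. To make the recognition model more robust to the differences in tilt, brightness, and resolution that occur in real filming, the original training images were augmented by rotation between −10° and 10°, random brightness adjustment, and cropping. This yielded 16 614 training images: 10 940 images from 2019 and 2020 plus 5 674 of the images taken in 2021. The remaining 649 images from 2021 form the test set. The cow images in both sets were annotated with the LabelImg image annotation tool.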
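The augmentation step can be illustrated with a small parameter sampler. This is only a sketch: the −10° to 10° rotation range comes from the text, while the brightness range, the crop fraction, and the function name `sample_augmentation` are assumptions introduced for illustration, since the paper does not give these values.

```python
import random


def sample_augmentation(rng):
    """Draw one set of augmentation parameters for a training image."""
    return {
        # Rotation range stated in the text: -10 degrees to 10 degrees.
        "angle_deg": rng.uniform(-10.0, 10.0),
        # ASSUMED brightness multiplier range (not given in the paper).
        "brightness": rng.uniform(0.7, 1.3),
        # ASSUMED fraction of each image side kept by the random crop.
        "crop_keep": rng.uniform(0.8, 1.0),
    }


rng = random.Random(0)  # fixed seed so the draws are repeatable
params = [sample_augmentation(rng) for _ in range(1000)]
```

Each dictionary would then drive an image library of choice to rotate, rescale brightness, and crop one training image; the bounding-box annotations have to be transformed with the same parameters.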

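2 Research methods

2.1 YOLO V4 model

Object detection is the task of recognizing and localizing objects in an image accurately and quickly. The YOLO object recognition algorithm merges classification, localization, and detection into a single network: feeding an image to the network yields the positions and classes of the targets at the same time. The network treats detection as a regression problem and offers a good balance of detection speed and accuracy.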

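The YOLO V4 object detection network improves on YOLO V3 in the backbone feature extraction network, the neck network for feature fusion, and the prediction output for classification and regression. During detection, YOLO V4 divides the input image into grids of different sizes; when the center of an object falls in a grid cell, that cell is responsible for detecting the object. In the backbone, YOLO V4 adopts the idea of the Cross Stage Partial Network (CSPNet)[32] to build the CSPDarknet53 structure. In the neck, YOLO V4 introduces the Spatial Pyramid Pooling (SPP) module[33] and the Path Aggregation Network (PANet)[34]. The SPP module applies max pooling with kernels of different sizes to the last feature layer output by CSPDarknet53 to enlarge the receptive field and extract the important contextual features. PANet passes feature information from the bottom up, fusing rich features and avoiding information loss.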
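The grid-responsibility rule can be sketched in a few lines. The 416-pixel input size is the training size used in this paper; the strides of 8, 16, and 32 are the values commonly used by YOLO V4 and are an assumption here, as are the function name `responsible_cell` and the example face center.

```python
def responsible_cell(cx, cy, stride):
    """Return (row, col) of the grid cell that contains the box center.

    cx, cy are pixel coordinates in the network input image.
    """
    return int(cy // stride), int(cx // stride)


INPUT_SIZE = 416        # training image size used in this paper
STRIDES = (8, 16, 32)   # ASSUMED standard YOLO V4 strides

center = (200.0, 120.0)  # hypothetical face center (x, y) in pixels
grid_sizes = {s: INPUT_SIZE // s for s in STRIDES}
cells = {s: responsible_cell(center[0], center[1], s) for s in STRIDES}
```

Under these assumptions the three detection scales use 52 × 52, 26 × 26, and 13 × 13 grids, and the same face center is assigned to one cell on each of them.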

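2.2 Construction of the YOLO V4 model fused with coordinate information

Cow face recognition is closely tied to where the face lies in the image, and locating the face accurately helps raise the accuracy of individual identification. This paper adds a coordinate attention module (Coordinate Attention)[35] to the backbone feature extraction network and a coordinate convolution module (CoordConv)[36] to the detection head of YOLO V4, improving cow face detection accuracy in 2 ways. Fig. 2 shows the structure of the improved YOLO V4 cow face detection network.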

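2.2.1 Adding coordinate attention to the backbone network

In the feature extraction part, high-resolution feature maps are more sensitive to position. Therefore, to preserve the positional information in cow images, coordinate attention is added after the CBM (Convolution-Batch Normalization-Mish) module, which consists of a convolution layer, batch normalization, and the Mish[37] activation function.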

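The coordinate attention module encodes channel relationships and long-range dependencies with precise positional information, in 2 steps: coordinate information embedding and coordinate attention generation. First, pooling kernels of two different shapes encode each channel of the input feature map along the horizontal and vertical coordinate directions; aggregating features along these 2 spatial directions yields a pair of direction-aware feature maps. Each of them captures the long-range dependencies of the input feature map along one spatial direction, so positional information is preserved in the generated maps. The 2 maps are then both applied to the input feature map by multiplication to emphasize the representation of the regions of interest.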
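The two steps can be illustrated on a single channel with plain Python lists. This is a deliberately reduced sketch, not the module from [35]: it keeps only the directional average pooling and the multiplicative reweighting, and it omits the learned 1 × 1 convolutions, batch normalization, and channel reduction that sit between them. The function names are hypothetical.

```python
import math


def _sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def directional_gates(x):
    """x is one channel: a list of H rows, each a list of W floats.

    Returns one gate per row and one gate per column.
    """
    h, w = len(x), len(x[0])
    # Average over the width: one descriptor per row (vertical position).
    row_mean = [sum(row) / w for row in x]
    # Average over the height: one descriptor per column (horizontal position).
    col_mean = [sum(x[i][j] for i in range(h)) / h for j in range(w)]
    # SIMPLIFICATION: gates come straight from the pooled values;
    # the real module passes them through learned layers first.
    return [_sigmoid(v) for v in row_mean], [_sigmoid(v) for v in col_mean]


def apply_coordinate_attention(x):
    """Reweight every element by its row gate and its column gate."""
    g_h, g_w = directional_gates(x)
    return [[x[i][j] * g_h[i] * g_w[j] for j in range(len(x[0]))]
            for i in range(len(x))]


feature = [[0.0, 0.0],
           [2.0, 2.0]]
out = apply_coordinate_attention(feature)
```

Because each gate depends on a whole row or a whole column, every output element carries information about its position along both axes, which is the property the module relies on.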

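2.2.2 Coordinate convolution module

As shown in Fig. 2, in the detection head, two-dimensional channel features built from coordinate information are stacked with the high-level semantic features to make the head more position-sensitive. The core of coordinate convolution is to add coordinate features explicitly before the convolution layer, so that the convolution is computed on features combined with positional information. It appends 2 channels to the input of the convolution layer that represent the horizontal and vertical coordinates of the feature map. In the first of these channels every element holds a coordinate value: the first row is filled with 0, the second row with 1, the third row with 2, and so on. The other channel is built in the same way along the other axis, and both are normalized. This preserves the relative positions of features and gives the convolution direct access to their coordinates.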
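Building the two coordinate channels takes only a few lines. The sketch below uses nested lists for a channel-first feature map; scaling the indices to [−1, 1] follows the CoordConv paper [36] and is an assumption about this model, since the text only says the channels are normalized. The function names are hypothetical.

```python
def coord_channels(h, w):
    """Return (row_channel, col_channel), each an h x w nested list."""
    def norm(k, n):
        # ASSUMED normalization: map index 0..n-1 to [-1, 1].
        return 0.0 if n == 1 else 2.0 * k / (n - 1) - 1.0

    # Every element in row i holds the (normalized) row index i.
    row_ch = [[norm(i, h) for _ in range(w)] for i in range(h)]
    # Every element in column j holds the (normalized) column index j.
    col_ch = [[norm(j, w) for j in range(w)] for _ in range(h)]
    return row_ch, col_ch


def add_coords(feature):
    """feature: list of C channels, each h x w. Returns C + 2 channels."""
    h, w = len(feature[0]), len(feature[0][0])
    row_ch, col_ch = coord_channels(h, w)
    return feature + [row_ch, col_ch]


feature = [[[0.5] * 3 for _ in range(3)]]  # one hypothetical 3 x 3 channel
stacked = add_coords(feature)
```

The convolution that follows then sees C + 2 input channels, so its filters can learn to depend on where in the feature map they are applied.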

2.3 Model training

2.3.1 Training parameter settings

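The hardware environment is a GeForce RTX 2080Ti GPU with 12 GB of video memory, the operating system is Ubuntu 16.04, and the model is built with the PyTorch deep learning framework. The training image size is set to 416 × 416 pixels, the model is trained for 100 epochs, and the batch size is set to 8. The weights are saved once after every epoch, so 100 sets of model weights are available at the end of training for evaluating the trained model.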

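This paper applies transfer learning and uses a pre-trained model to speed up training. During the first 50 epochs the backbone parameters are frozen and only the classifier is trained, with the learning rate set to 1×10⁻³; the larger learning rate updates the parameters quickly and accelerates convergence. During the last 50 epochs the backbone parameters are unfrozen and the learning rate is set to 1×10⁻⁴; the smaller learning rate fine-tunes the network and approaches the optimum step by step.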
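The two-stage schedule can be written down as a small lookup. The numbers are the ones given above; the function name `training_phase` and the 1-based epoch convention are assumptions made for this sketch.

```python
TOTAL_EPOCHS = 100
FREEZE_EPOCHS = 50
BATCH_SIZE = 8
INPUT_SIZE = (416, 416)


def training_phase(epoch):
    """Return the settings for a 1-based epoch index (1..100)."""
    if not 1 <= epoch <= TOTAL_EPOCHS:
        raise ValueError("epoch must be between 1 and 100")
    if epoch <= FREEZE_EPOCHS:
        # Stage 1: backbone frozen, larger learning rate.
        return {"freeze_backbone": True, "lr": 1e-3}
    # Stage 2: backbone unfrozen, smaller learning rate for fine-tuning.
    return {"freeze_backbone": False, "lr": 1e-4}


first_stage = training_phase(1)
second_stage = training_phase(51)
```

In a PyTorch training loop, the `freeze_backbone` flag would typically be applied by toggling `requires_grad` on the backbone parameters at the start of epoch 51.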

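2.3.2 Evaluation metrics

This paper uses average precision (AP), mean average precision (mAP), and average frame rate (frames per second, FPS) as evaluation metrics, where FPS is the number of images recognized in 1 s. AP is obtained by plotting the P-R curve with recall (R) on the horizontal axis and precision (P) on the vertical axis and integrating to get the area under the curve; mAP is the mean of the AP values over all classes. They are computed as in Equations (1) and (2).

$$AP = \int_0^1 P(R)\,\mathrm{d}R \tag{1}$$

$$mAP = \frac{1}{N}\sum_{i=1}^{N} AP_i \tag{2}$$

where P(R) is the function of the P-R curve, N is the number of cow classes, and AP_i is the average precision of the i-th cow class.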
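As a worked illustration of the two metrics, the sketch below integrates a P-R curve numerically and averages the per-class results. The trapezoidal rule is one assumed way to approximate the integral (the paper does not say which scheme it uses, and detection toolkits often use interpolated precision instead), and the sample points are made up.

```python
def average_precision(recalls, precisions):
    """Area under a P-R curve by the trapezoidal rule.

    recalls must be sorted in ascending order and paired with precisions.
    """
    area = 0.0
    for k in range(1, len(recalls)):
        width = recalls[k] - recalls[k - 1]
        height = (precisions[k] + precisions[k - 1]) / 2.0
        area += width * height
    return area


def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


# Hypothetical P-R points for one cow class.
ap_a = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.5])
# Hypothetical second class that is detected perfectly.
ap_b = average_precision([0.0, 1.0], [1.0, 1.0])
m_ap = mean_average_precision([ap_a, ap_b])
```

With 71 classes, `mean_average_precision` would simply receive a list of 71 AP values, one per cow.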

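3 Results and analysis

The performance of the improved model for cow face recognition is verified and analyzed from 2 aspects: the experimental results of different models, and the recognition results for cows with occluded faces.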

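3.1 Comparison of results from different models

The test set images were recognized with the SSD model, the CenterNet model, the Faster R-CNN model, the YOLO V4 model, the CA-YOLO V4 model (YOLO V4 with coordinate attention added to the feature extraction network), and the improved YOLO V4 model. The results are shown in Table 1.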

Table 1 Recognition results of different models for individual dairy cows

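The improved YOLO V4 cow face recognition model proposed in this study outperforms the other models. CA-YOLO V4, the YOLO V4 model with the coordinate attention module added, raises mAP by 0.7 percentage points over the original YOLO V4. On top of that, the improved model with both coordinate attention and the coordinate convolution module further improves detection on the cow dataset: its mAP is 0.19 percentage points higher than CA-YOLO V4 and 0.89 percentage points higher than the original YOLO V4. Compared with CenterNet, its detection speed is lower but its mAP is 10.92 percentage points higher. Compared with Faster R-CNN and SSD, it is faster and its mAP is 1.51 and 16.32 percentage points higher, respectively. In other words, adding coordinate attention to the feature extraction network and coordinate position information to the detection head effectively improves cow face recognition accuracy, which verifies the effectiveness of the proposed method.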
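The stated gains can be cross-checked with simple arithmetic. The snippet below derives the baseline mAP values implied by the reported 93.68% and the stated differences. These are values inferred from the text, not numbers read from Table 1, and they are only as accurate as the rounding of the reported gains.

```python
IMPROVED_MAP = 93.68  # mAP of the improved YOLO V4 model, in percent

# Gains of the improved model over each baseline, in percentage points.
GAIN_OVER = {
    "SSD": 16.32,
    "CenterNet": 10.92,
    "Faster R-CNN": 1.51,
    "YOLO V4": 0.89,
    "CA-YOLO V4": 0.19,
}

# Baseline mAP implied by each stated gain (inferred, not tabulated).
implied = {name: round(IMPROVED_MAP - gain, 2)
           for name, gain in GAIN_OVER.items()}

# Consistency check: CA-YOLO V4 should sit 0.7 points above YOLO V4.
ca_gain_over_yolo = round(implied["CA-YOLO V4"] - implied["YOLO V4"], 2)
```

The implied difference between CA-YOLO V4 and the original YOLO V4 comes out at 0.70 percentage points, which agrees with the 0.7-point gain reported for the coordinate attention module alone.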

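Fig. 3 compares some of the detection results of the different models. The SSD, CenterNet, Faster R-CNN, and YOLO V4 models produce missed or wrong detections. For example, the faces of cows No. 15083 and No. 17125 have large black patches that resemble the body, so the models extract too little facial feature information; as a result, SSD misses cow No. 17125 and CenterNet misses cow No. 15083. In addition, because their facial patch patterns are similar, Faster R-CNN misidentifies cow No. 14188 as cow No. 18044, and YOLO V4 misidentifies cow No. 17060 as cow No. 18051. The proposed model recognizes all 4 of these cows correctly.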

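3.2 Recognition results of different models for cows with occluded faces

During filming, the farm environment means that cow faces are often occluded to varying degrees when the cows move inside the fence. This paper selected 120 occluded face images from 10 cow classes for testing. The results are shown in Table 2.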

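As Table 2 shows, for the occluded images of the 10 cow classes, the AP of the proposed model is higher than or equal to that of the other 4 models for every class, and its mAP reaches 92.60%, which is 30.52 and 12.79 percentage points higher than CenterNet and SSD, respectively. Compared with YOLO V4, mAP is 10.95 percentage points higher, with the AP of cows No. 20093, No. 20098, and No. 20121 rising by 40, 9, and 44 percentage points, respectively. Compared with Faster R-CNN, mAP is 7.91 percentage points higher, with the AP of cows No. 17107, No. 20104, and No. 20121 rising by 26, 14, and 24 percentage points, respectively. These results indicate that fusing coordinate information into the feature extraction network and the detection head helps the improved YOLO V4 model cope with facial occlusion.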

Table 2 Recognition precision of different models for cows with occluded faces

Note: The number of images listed with each cow ID is the number of occluded-face sample images of that cow.

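3.3 Discussion

1) Effect of facial texture on recognition results

The proposed improved YOLO V4 model reaches an mAP of 93.68% for cow face recognition, higher than CA-YOLO V4, YOLO V4, CenterNet, Faster R-CNN, and SSD, but some recognition problems remain. The improved model fails to identify some cows with black-and-white faces correctly. Fig. 4 shows one such error for a single cow: in Fig. 4a, cow No. 18044 is misidentified as cow No. 15036. The face image of cow No. 15036 in Fig. 4b shows why: the facial patches of the two cows are extremely similar, and no feature clearly separates them, which leads to the error.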

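2) Effect of facial occlusion on recognition results

As shown in Fig. 5, cows No. 17107 and No. 20121 are both heavily occluded, so too little useful information is available for feature extraction. In addition, the faces of these two cows are almost entirely black, and the facial outline resembles the black parts of the body, which makes feature extraction even harder. As a result, the improved YOLO V4 model has difficulty recognizing them accurately.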

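4 Conclusions

Cow face recognition is an important prerequisite for precise individual feeding and behavior understanding in smart dairy farming. To achieve fast, accurate, and non-contact cow face recognition, this paper fuses coordinate information into the YOLO V4 object detection model and strengthens recognition performance by making the model more sensitive to cow location. The main conclusions are as follows: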

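1) An improved YOLO V4 model for individual cow face recognition is proposed by adding an attention module with embedded positional information to the feature extraction layers and a coordinate convolution module to the detection head. The improved model reaches a mean average precision of 93.68% on the test set, which is 16.32, 10.92, 0.89, 1.51, and 0.19 percentage points higher than the SSD, CenterNet, YOLO V4, Faster R-CNN, and CA-YOLO V4 models, respectively. In detection speed, the improved YOLO V4 model is slightly slower than the original YOLO V4 model, but its cow face recognition accuracy is higher. The results show that the improved YOLO V4 model is more position-sensitive in locating cow faces, which further improves recognition accuracy.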

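2) For cows with occluded faces, the improved YOLO V4 model outperforms the SSD, CenterNet, YOLO V4, and Faster R-CNN models. Its recognition rate reaches 92.60%, which is 12.79, 30.52, 10.95, and 7.91 percentage points higher than those models, respectively. However, recognition accuracy still needs to improve when facial features are indistinct because of large occluded areas or very dim light.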

[1] Matthews S G, Miller A L, Clapp J, et al. Early detection of health and welfare compromises through automated detection of behavioural changes in pigs[J]. The Veterinary Journal, 2016, 217: 43-51.

[2] 何东健，刘冬，赵凯旋. 精准畜牧业中动物信息智能感知与行为检测研究进展[J]. 农业机械学报，2016，47(5)：231-244.

He Dongjian, Liu Dong, Zhao Kaixuan. Review of perceiving animal information and behavior in precision livestock farming[J]. Transactions of the Chinese Society for Agricultural Machinery, 2016, 47(5): 231-244. (in Chinese with English abstract)

[3] Santoni M M, Sensuse D I, Arymurthy A M, et al. Cattle race classification using gray level co-occurrence matrix convolutional neural networks[J]. Procedia Computer Science, 2015, 59: 493-502.

[4] Tsai D M, Huang C Y. A motion and image analysis method for automatic detection of estrus and mating behavior in cattle[J]. Computers and Electronics in Agriculture, 2014, 104: 25-31.

[5] Kumar S, Pandey A, Kondamudi S, et al. Deep learning framework for recognition of cattle using muzzle point image pattern[J]. Measurement, 2018, 116: 1-17.

[6] 何东健，刘畅，熊虹婷. 奶牛体温植入式传感器与实时监测系统设计与试验[J]. 农业机械学报，2018，49(12)：195-202.

He Dongjian, Liu Chang, Xiong Hongting. Design and experiment of implantable sensor and real-time detection system for temperature monitoring of cow[J]. Transactions of the Chinese Society for Agricultural Machinery, 2018, 49(12): 195-202. (in Chinese with English abstract)

[7] 刘忠超，何东健. 基于卷积神经网络的奶牛发情行为识别方法[J]. 农业机械学报，2019，50(7)：186-193.

Liu Zhongchao, He Dongjian. Recognition method of cow estrus behavior based on convolutional neural network[J]. Transactions of the Chinese Society for Agricultural Machinery, 2019, 50(7): 186-193. (in Chinese with English abstract)

[8] 刘忠超，翟天嵩，何东健. 精准养殖中奶牛个体信息监测研究现状及进展[J]. 黑龙江畜牧兽医，2019(13)：30-33，38.

Liu Zhongchao, Zhai Tiansong, He Dongjian. Research status and progress of individual information monitoring of dairy cows in precision breeding[J]. Heilongjiang Animal Science and Veterinary Medicine, 2019(13): 30-33, 38. (in Chinese with English abstract)

[9] 蒋国滨. 奶牛个体识别的标记方法[J]. 饲料博览，2018(5)：86.

[10] 孙雨坤，王玉洁，霍鹏举，等. 奶牛个体识别方法及其应用研究进展[J]. 中国农业大学学报，2019，24(12)：62-70.

Sun Yukun, Wang Yujie, Huo Pengju, et al. Research progress on methods and application of dairy cow identification[J]. Journal of China Agricultural University, 2019, 24(12): 62-70. (in Chinese with English abstract)

[11] 耿丽微，钱东平，赵春辉. 基于射频技术的奶牛身份识别系统[J]. 农业工程学报，2009，25(5)：137-141.

Geng Liwei, Qian Dongping, Zhao Chunhui. Cow identification technology system based on radio frequency[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2009, 25(5): 137-141. (in Chinese with English abstract)

[12] Barry B, Gonzales-Barron U, Butler F, et al. Using muzzle pattern recognition as a biometric approach for cattle identification[J]. Transactions of the ASABE, 2007, 50(3): 1073-1080.

[13] Lu Y, He X, Wen Y, et al. A new cow identification system based on iris analysis and recognition[J]. International Journal of Biometrics, 2014, 6(1): 18-32.

[14] Allen A, Golden B, Taylor M, et al. Evaluation of retinal imaging technology for the biometric identification of bovine animals in Northern Ireland[J]. Livestock Science, 2008, 116(1): 42-52.

[15] 许贝贝，王文生，郭雷风，等. 基于非接触式的牛只身份识别研究进展与展望[J]. 中国农业科技导报，2020，22(7)：79-89.

Xu Beibei, Wang Wensheng, Guo Leifeng, et al. A review and future prospects on cattle recognition based on non-contact identification[J]. Journal of Agricultural Science and Technology, 2020, 22(7): 79-89. (in Chinese with English abstract)

[16] 燕红文，刘振宇，崔清亮，等. 基于特征金字塔注意力与深度卷积网络的多目标生猪检测[J]. 农业工程学报，2020，36(11)：193-202.

Yan Hongwen, Liu Zhenyu, Cui Qingliang, et al. Multi-target detection based on feature pyramid attention and deep convolution network for pigs[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(11): 193-202. (in Chinese with English abstract)

[17] 胡志伟，杨华，娄甜田. 采用双重注意力特征金字塔网络检测群养生猪[J]. 农业工程学报，2021，37(5)：166-174.

Hu Zhiwei, Yang Hua, Lou Tiantian. Instance detection of group breeding pigs using a pyramid network with dual attention feature[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(5): 166-174. (in Chinese with English abstract)

[18] 蔡骋，宋肖肖，何进荣. 基于计算机视觉的牛脸轮廓提取算法及实现[J]. 农业工程学报，2017，33(11)：171-177.

Cai Cheng, Song Xiaoxiao, He Jinrong. Algorithm and realization for cattle face contour extraction based on computer vision[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(11): 171-177. (in Chinese with English abstract)

[19] 宋怀波，牛满堂，姬存慧，等. 基于视频分析的多目标奶牛反刍行为监测[J]. 农业工程学报，2018，34(18)：211-218.

Song Huaibo, Niu Mantang, Ji Cunhui, et al. Monitoring of multi-target cow ruminant behavior based on video analysis technology[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(18): 211-218. (in Chinese with English abstract)

[20] Cai C, Li J. Cattle face recognition using local binary pattern descriptor[C]//Signal and Information Processing Association Annual Summit and Conference, 2013: 1-4.

[21] Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.

[22] Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multiBox detector[C]// European Conference on Computer Vision. 2016: 21-37.

[23] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE conference on computer vision and pattern recognition, 2016: 779-788.

[24] Redmon J, Farhadi A. YOLO9000: Better, faster, stronger[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 6517-6525.

[25] Redmon J, Farhadi A. Yolov3: An incremental improvement[J]. arXiv preprint arXiv: 1804.02767, 2018.

[26] Zhou X, Wang D, Krähenbühl P. Objects as points[J]. arXiv preprint arXiv: 1904.07850, 2019.

[27] 赵凯旋，何东健. 基于卷积神经网络的奶牛个体身份识别方法[J]. 农业工程学报，2015，31(5)：181-187.

Zhao Kaixuan, He Dongjian. Recognition of individual dairy cattle based on convolutional neural networks[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2015, 31(5): 181-187. (in Chinese with English abstract)

[28] 何东健，刘建敏，熊虹婷，等. 基于改进YOLO v3模型的挤奶奶牛个体识别方法[J]. 农业机械学报，2020，51(4)：250-260.

He Dongjian, Liu Jianmin, Xiong Hongting, et al. Individual identification of dairy cows based on improved YOLO v3[J]. Transactions of the Chinese Society for Agricultural Machinery, 2020, 51(4): 250-260. (in Chinese with English abstract)

[29] Andrew W, Greatwood C, Burghardt T. Visual localisation and individual identification of holstein friesian cattle via deep learning[C]// Proceedings of the IEEE International Conference on Computer Vision Workshops. 2017: 2850-2859.

[30] 姚礼垚，熊浩，钟依健，等. 基于深度网络模型的牛脸检测算法比较[J]. 江苏大学学报：自然科学版，2019，40(2)：197-202.

Yao Liyao, Xiong Hao, Zhong Yijian, et al. Comparison of cow face detection algorithms based on deep network model[J]. Journal of Jiangsu University: Natural Science Edition, 2019, 40(2): 197-202. (in Chinese with English abstract)

[31] Bochkovskiy A, Wang C Y, Liao H Y M. YOLOv4: Optimal speed and accuracy of object detection[J]. arXiv preprint arXiv: 2004.10934, 2020.

[32] Wang C Y, Liao H Y M, Yeh I H, et al. CSPNet: A new backbone that can enhance learning capability of CNN[J]. arXiv preprint arXiv: 1911.11929, 2019.

[33] He K, Zhang X, Ren S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.

[34] Liu S, Qi L, Qin H, et al. Path aggregation network for instance segmentation[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 8759-8768.

[35] Hou Q, Zhou D, Feng J. Coordinate attention for efficient mobile network design[J]. arXiv preprint arXiv: 2103.02907, 2021.

[36] Liu R, Lehman J, Molino P, et al. An intriguing failing of convolutional neural networks and the CoordConv solution[J]. arXiv preprint arXiv: 1807.03247, 2018.

[37] Misra D. Mish: A self regularized non-monotonic neural activation function[J]. arXiv preprint arXiv: 1908.08681, 2019.

Improved YOLO V4 model for face recognition of dairy cows by fusing coordinate information

Yang Shuqin1,2,3, Liu Yangqihang1,2,3, Wang Zhen1, Han Yuanyuan1, Wang Yongsheng4, Lan Xianyong5

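(1. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, China; 2. Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Yangling 712100, China; 3. Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling 712100, China; 4. College of Veterinary Medicine, Northwest A&F University, Yangling 712100, China; 5. College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China)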

Identifying individual dairy cows is one of the most important prerequisites for intelligent, precise, and large-scale dairy farming. It provides basic information for formulating individual feeding plans and for analyzing milk production efficiency and health status, and it is an important link in managing milk source traceability, disease prevention, and insurance claim settlement. Traditional manual identification of cows, such as ear tags, brands, neck chains, and tattoos, is time-consuming and laborious, and it easily causes stress responses that injure cows and people. Identification based on Radio Frequency Identification (RFID) or on physiological characteristics such as muzzle patterns, the iris, and retinal blood vessels still has shortcomings in durability, cost, and accessibility. In this study, a cow face recognition method that fuses coordinate information into an improved YOLO V4 detection model was proposed to identify individual dairy cows accurately and without contact. Holstein cows were taken as the research object. First, facial images of 71 cows were collected on a working dairy farm over three years, covering cows at different growth stages and under various lighting conditions, postures, and degrees of occlusion. A preprocessing step removed blurry, severely occluded, poorly lit, and abnormal images. The preprocessed dataset was then expanded by rotation from -10° to 10°, random brightness adjustment, and cropping to improve the generalization performance of the model. In total, 16 614 training images were obtained, comprising 10 940 images from 2019 and 2020 and 5 674 images taken in 2021; the remaining 649 images from 2021 were used as the test set. Second, coordinate attention and a coordinate convolution module (CoordConv) containing coordinate channels were introduced into the feature extraction layer and the detection head of the YOLO V4 network, respectively, to increase the sensitivity of the model to target location. Finally, the improved YOLO V4 model was compared with 5 object detection models to verify its effectiveness. The test results showed that the mean average precision of the improved YOLO V4 model was 93.68%, which was 16.32, 10.92, 0.89, 1.51, and 0.19 percentage points higher than that of the SSD, CenterNet, YOLO V4, Faster R-CNN, and CA-YOLO V4 models, respectively. In terms of detection speed, the improved YOLO V4 model was slightly slower than the original YOLO V4 model. Furthermore, the improved YOLO V4 model recognized cows with occluded faces better than the other models: its recognition rate reached 92.60%, which was 12.79, 30.52, 10.95, and 7.91 percentage points higher than that of the SSD, CenterNet, YOLO V4, and Faster R-CNN models, respectively. Nevertheless, recognition accuracy still needs to improve when facial features are indistinct because of large occluded areas or dim light. The experiments demonstrated that coordinate information makes the improved YOLO V4 model more sensitive to the position of the cow face and thereby raises recognition accuracy. This finding can provide effective technical support for cow face recognition in precision dairy farming.

image recognition; animals; dairy cow face; YOLO V4; attention mechanism; coordinate convolution

杨蜀秦，刘杨启航，王振，等. 基于融合坐标信息的改进YOLO V4模型识别奶牛面部[J]. 农业工程学报，2021，37(15)：129-135.doi：10.11975/j.issn.1002-6819.2021.15.016 http://www.tcsae.org

Yang Shuqin, Liu Yangqihang, Wang Zhen, et al. Improved YOLO V4 model for face recognition of diary cow by fusing coordinate information[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(15): 129-135. (in Chinese with English abstract) doi：10.11975/j.issn.1002-6819.2021.15.016 http://www.tcsae.org

Received: 2021-06-04

Revised: 2021-07-21

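Shaanxi Provincial Agricultural Science and Technology Innovation and Transformation Project (NYKJ-2020-YL-07)

Yang Shuqin, PhD, associate professor. Research interest: applications of computer vision in agricultural information. Email: yangshuqin1978@163.com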

DOI: 10.11975/j.issn.1002-6819.2021.15.016

CLC number: TP391.4

Article ID: 1002-6819(2021)-15-0129-07