Recognition of dead pine trees using YOLOv5 by super-resolution reconstruction
2023-05-15
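WANG Wenjin1, YOU Ziyi2, SHAO Lijiang1, LI Xiaolin1, WU Songqing2,3, ZHANG Zhuhe4, HUANG Shiguo1,3※, ZHANG Feiping2,3
(1. College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002; 2. College of Forestry, Fujian Agriculture and Forestry University, Fuzhou 350002; 3. Key Laboratory of Integrated Pest Management in Ecological Forests, Fujian Province University, Fuzhou 350002; 4. Fuzhou Forest Pest Control and Quarantine Station, Fuzhou 350002)
Rugged mountain terrain forces unmanned aerial vehicles (UAVs) to fly at high altitude, so dead pine trees appear small and blurred in texture in the images and are hard to recognize. To address this problem, this study proposes a YOLOv5-based dead pine tree recognition algorithm that performs super-resolution reconstruction at the feature level. A selective kernel feature texture transfer (SKFTT) module is added to the YOLOv5 network to generate high-resolution detection feature maps with detailed texture; its mechanism of adaptively changing the receptive field assigns weights so that more attention is focused on texture detail, which improves recognition accuracy for small and blurred targets. In addition, a foreground background balance loss function (FB Loss) suppresses background noise and increases the gradient contribution of positive samples, alleviating the imbalance between positive and negative samples. Experimental results show that the improved algorithm achieves a mean average precision at an intersection over union (IoU) threshold of 0.5 (mAP50) of 92.7%, an mAP50~95 (the mAP averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05) of 62.1%, and an APsmall (average precision for small targets) of 53.2%. Compared with the original algorithm, mAP50, mAP50~95, and APsmall increase by 3.2, 8.3, and 15.8 percentage points, respectively. A comparison of algorithms shows that the method outperforms Faster R-CNN, YOLOv4, YOLOX, MT-YOLOv6, QueryDet, and DDYOLOv5, with mAP50 higher by 16.7, 15.3, 2.5, 2.8, 12.3, and 1.2 percentage points, respectively. The improved algorithm recognizes dead pine trees with high accuracy, effectively eases the difficulty of recognizing small and texture-blurred targets, and provides technical support for the subsequent removal of infected trees.
UAV; image recognition; dead pine trees; small target detection; super-resolution reconstruction; feature fusion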
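0 Introduction
The pine wood nematode is a quarantine pest that seriously harms pine trees, and 60 million hm² of pine forest in China face the threat of a pine wilt disease pandemic[1]. Rapid monitoring and recognition of dead pine trees is key to controlling the disease. Traditional visual inspection and ground surveys are time-consuming, laborious, and hard to carry out. In recent years, collecting stand images rapidly with UAVs has shown potential for large-scale, efficient monitoring of areas affected by pine wilt disease[2]. Recognizing dead pine trees quickly and accurately from imagery has therefore become a research focus.
Traditional dead pine tree recognition algorithms mainly rely on hand-crafted feature descriptors built from color, shape, and texture for recognition or segmentation[3-5], and struggle to extract representative semantic information[6]. In recent years, with advances in digital imaging and artificial intelligence, deep-learning algorithms have shown good recognition ability in agriculture and forestry. Some researchers have tuned or modified common object detection algorithms such as GoogLeNet[7], Faster R-CNN[8-10], YOLOv3[11-13], SSD[14], and YOLOv4[10,15-17]. LI et al.[18] applied YOLOv5 to predict dead pine trees in UAV and satellite images, and HU et al.[19] improved YOLOv5 with efficient channel attention (ECA) and hybrid dilated convolution for dead pine tree recognition. In forestry production, because the terrain is rugged, UAVs must fly at relatively high altitude (for example, above 800 m) to meet the needs of large-scale monitoring. Targets in pine forest images captured under these conditions occupy few pixels, have low resolution, and are hard to recognize[20]. Moreover, images of stands with high canopy density contain targets that are occluded by branches and leaves, overexposed, or backlit; their boundaries and textures are blurred, texture features are easily lost during feature extraction, and false detections are severe[15]. Existing studies, however, mainly use stand images taken by UAVs flying below 500 m, or even at about 100 m, and such altitudes can hardly meet practical production needs. Dead pine tree recognition techniques that meet the needs of forestry production are therefore urgently required.
Super-resolution (SR) reconstruction aims to recover a high-resolution image from a low-quality, blurred one by means of a dedicated algorithm. Approaches include interpolation, reconstruction-based, and learning-based methods[21], of which deep-learning methods are the current research focus. DONG et al.[22] proposed the first SR network based on a convolutional neural network: the image is enlarged by bicubic interpolation and then restored by a convolutional network, which outperformed earlier classical algorithms. More neural network variants were subsequently introduced to the SR task, such as SR networks based on generative adversarial networks[23] and on Transformers[24], further improving reconstruction quality. In recent years, many deep-learning SR networks have been proposed to support downstream vision tasks[25-27]. Studies show that raising image resolution to increase the information in an image can effectively recover the details of small and texture-blurred targets and improve recognition performance[28-30], but feeding large images to the network markedly increases computation[31-32]. Some researchers have therefore applied super-resolution directly at the feature level, improving the expressiveness of shallow features while preserving running speed. NOH et al.[33] trained a super-resolution network following the idea of generative adversarial networks, using high-resolution features as a direct supervision signal and training on pairs after matching their receptive field to that of the low-resolution features; however, the contextual relation between shallow and deep features was not considered when reconstructing low-resolution features, so the generated features were unstable. DENG et al.[34] proposed a feature super-resolution method based on feature texture transfer, which uses knowledge distillation with the higher-resolution image information in an additionally extended feature pyramid network (FPN) as supervision to generate high-resolution feature maps for recognition; but features are assigned fixed weights by default, which limits the expressiveness of the network and wastes computation.
In summary, existing object recognition methods that integrate super-resolution reconstruction either fuse features with fixed weights or ignore contextual relations, and so have difficulty recognizing small and texture-blurred targets precisely. This study therefore builds on YOLOv5[35], which performs well among the YOLO series, and proposes a YOLOv5 dead pine tree recognition model with integrated super-resolution reconstruction.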
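1 Materials and methods
1.1 Dead pine tree dataset
1.1.1 Overview of the study area
The UAV images used in this study span 25°20′N~26°18′N, 118°29′E~119°31′E and cover 15 400 hm² in Minhou County of Fuzhou City and Xianyou County of Putian City, Fujian Province: about 7 467 hm² in Baisha Town, 5 333 hm² in Hongwei Township, 1 133 hm² in Zhuqi Township, and 933 hm² in Ganzhe Subdistrict of Minhou County, and about 533 hm² in Xitianwei Town of Xianyou County. Minhou County has a mid-subtropical monsoon climate, with a mean annual temperature of 19.5 ℃ and mean annual rainfall of about 1 673.9 mm. Xianyou County has a south subtropical maritime monsoon climate, with a mean annual temperature of 20.6 ℃ and mean annual rainfall of about 1 300~2 300 mm; it is warm and humid all year with a long frost-free period. The study area is mostly mountainous and hilly with strongly undulating terrain; elevation ranges from 400 to 1 200 m, and most of the area lies above 800 m. Forest land soil is mainly red soil, with a small share of mountain yellow soil. The stands are mostly natural Masson pine forest, followed by mixed Masson pine and broadleaf forest.
1.1.2 Image acquisition and preprocessing
To meet the requirements for survey area and flight altitude, a CW-007 fixed-wing UAV carrying a CA-102 42-megapixel camera was used to collect the imagery. Based on when large numbers of late-stage discolored dead pine trees appear, data were collected in November 2020 and October 2021. Constrained by the terrain, flight altitude ranged from 800 to 1 200 m. The captured images, at a resolution of 7 952×5 304 pixels, were stitched with Pix4Dmapper into 39 TIF images ranging in size from 31 984×26 045 to 64 033×50 719 pixels.
The stitched images were first cropped into 600×600 pixel tiles, giving 8 978 sub-images. Image information entropy was used to measure imaging quality, and blank and edge images with an entropy below 5 were removed. Part of the data is shown in Fig. 1; the scenes include sunny, cloudy, and twilight conditions, and some overexposed and backlit images were retained to increase sample diversity and thereby improve model robustness. The remaining 7 923 images were annotated; 13 581 uncertain samples were verified in the field, and 29 250 dead pine tree samples were obtained in total. The annotated dataset was stored in COCO format and split into training, validation, and test sets at a ratio of 8∶1∶1. Fig. 2a shows that targets in dense areas have blurred edges and that small dead pine trees have low resolution and blurred texture. In the distribution of label-box sizes in Fig. 2b, points cluster in the lower-left corner, which indicates that the data contain many dead pine trees that qualify as small targets.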
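The tiling and entropy screening described above can be illustrated with a minimal sketch. It assumes that each tile has already been converted to a flat list of 8-bit grey levels and that only full 600×600 tiles are generated; the text does not state how partial edge tiles were produced or how the grey-level image was derived, so the function names and those choices are hypothetical.

```python
import math
from collections import Counter


def tile_origins(width, height, tile=600):
    """Yield the top-left (x, y) corner of every full tile-by-tile crop.

    Assumption: partial tiles at the right and bottom edges are skipped.
    """
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            yield x, y


def shannon_entropy(gray_pixels):
    """Shannon entropy in bits of a flat sequence of 8-bit grey levels."""
    total = len(gray_pixels)
    counts = Counter(gray_pixels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def keep_tile(gray_pixels, threshold=5.0):
    """Keep a tile unless its entropy is below the threshold (5 in the text)."""
    return shannon_entropy(gray_pixels) >= threshold
```

A blank tile has a single grey level and therefore zero entropy, so it is rejected, whereas a tile whose grey levels are spread evenly over all 256 values reaches the maximum of 8 bits.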
Fig. 1 Part of the data: a. Sunny; b. Cloudy; c. Twilight; d. Overexposure; e. Backlight
Fig. 2 Training-set examples and distribution of label-box sizes
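1.2 YOLOv5 dead pine tree recognition model with integrated super-resolution reconstruction
To address the low recognition accuracy caused by small and texture-blurred targets in UAV stand images, this study takes YOLOv5x from YOLOv5 version 6.0 as the base model and proposes a YOLOv5 dead pine tree recognition algorithm with integrated super-resolution reconstruction (Fig. 3). The core idea is to use a super-resolution module that fuses multi-level features into a high-resolution feature map to improve recognition accuracy. YOLOv5 is modified as follows: 1) A selective kernel feature texture transfer (SKFTT) module is proposed for feature-level super-resolution reconstruction; it uses a self-attention mechanism to selectively fuse features at each scale into high-resolution feature information, which makes the network more adaptive. 2) The neck network is changed: SKFTT replaces the original UpSample and Concat operations, the shallow feature maps enriched with semantics are used to recognize small and texture-blurred targets, and the neck is extended downward by one layer to recognize even smaller targets. 3) The foreground background balance loss function (FB Loss) is used to suppress background noise, raise attention to positive samples, and strengthen learning of target regions.
Fig. 3 YOLOv5 dead pine tree recognition network with integrated super-resolution reconstruction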
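1.2.1 Selective kernel feature texture transfer module
Noise in shallow features is usually passed downward through the successive convolutions of the backbone network, where it swamps the key information of target regions while the resolution of the feature maps keeps falling. This study proposes the SKFTT module to extract and strengthen the key information of target regions and thereby improve recognition accuracy.
The SKFTT module consists of a content extractor, a texture extractor, and a selective kernel feature fuser (Fig. 4). The main input L passes through the content extractor to obtain semantic content information, which sub-pixel convolution then strengthens into a feature map L′ at twice the resolution. To obtain richer contextual information, the required texture information Lt′ is extracted from the reference feature layer Lt. Finally, L′ and Lt′ are fed together into the selective kernel feature fuser, where a self-attention mechanism assigns more effective weight to the shallow high-resolution feature map and strengthens the propagation of fine target texture through the network.
Both the content extractor and the texture extractor use residual connections. First, stacks of convolution, batch normalization (BN), and rectified linear unit (ReLU) layers extract strong semantic information and high-resolution information, respectively. The input is then connected to the output layer, which avoids vanishing gradients in a deep network and passes information on intact. Finally, the texture extractor strengthens features with a convolutional layer, while the content extractor raises resolution with sub-pixel convolution to generate high-resolution features.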
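As an illustration of the rearrangement step inside sub-pixel convolution, the sketch below maps a feature map with C·r² channels of size H×W to C channels of size rH×rW. It is written in plain Python on nested lists and assumes the channel ordering commonly used by pixel-shuffle layers; the convolution that precedes the rearrangement and the exact layer configuration of the content extractor are not specified here, and the function name is hypothetical.

```python
def pixel_shuffle(feature, r=2):
    """Rearrange a (C*r*r, H, W) nested list into a (C, H*r, W*r) nested list.

    Assumed ordering: input channel c*r*r + i*r + j supplies the output
    pixel at row offset i and column offset j within each r-by-r block.
    """
    channels_in = len(feature)
    height, width = len(feature[0]), len(feature[0][0])
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    channels_out = channels_in // (r * r)
    out = [[[0.0] * (width * r) for _ in range(height * r)]
           for _ in range(channels_out)]
    for c in range(channels_out):
        for i in range(r):
            for j in range(r):
                src = feature[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out
```

With r = 2, four 1×1 input channels become one 2×2 output channel, which is how channel depth is traded for the doubled spatial resolution mentioned in the text.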
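The selective kernel feature fuser is inspired by the way neurons in the human visual system change their receptive field in response to stimuli, and it dynamically adjusts the information it receives through a fuse operator and a select operator. In the fuse step, the feature maps carrying rich content and texture information are first aggregated; global average pooling then gathers global information, convolution and activation yield a compact reduced-dimension feature, and two parallel convolutions expand the channels to give two feature descriptors v1 and v2. In the select step, more weight is assigned to the shallow high-resolution feature map: a softmax activation applied to v1 and v2 gives the selection weight matrices s1 and s2, which then adaptively weight the multi-scale feature maps, capturing more effective features and enhancing the detail of small and texture-blurred targets.
Note: L is the main input layer, Lt is the reference feature layer, L′ is the feature layer after content extraction, Lt′ is the feature layer after texture extraction, L″ is the reconstructed feature layer, s is the global information after global average pooling, z is the compact reduced-dimension feature, v1 and v2 are the feature descriptors, and s1 and s2 are the selection weight matrices.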
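The fuse and select steps can be sketched in a few lines of plain Python. This is an illustration only: the 1×1 convolutions are modelled as small weight matrices (`w_reduce`, `w_a`, `w_b`, all hypothetical), ReLU is assumed for the unspecified activation, and the aggregation of the two inputs is assumed to be an element-wise sum.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def sk_fuse(content, texture, w_reduce, w_a, w_b):
    """Fuse two (C, H, W) nested-list feature maps with per-channel softmax weights."""
    n_ch = len(content)
    # Fuse: element-wise sum, then global average pooling gives s (length C).
    pooled = []
    for c in range(n_ch):
        vals = [a + b
                for row_a, row_b in zip(content[c], texture[c])
                for a, b in zip(row_a, row_b)]
        pooled.append(sum(vals) / len(vals))
    # Compact feature z: channel reduction followed by ReLU (assumed activation).
    z = [max(0.0, x) for x in matvec(w_reduce, pooled)]
    # Two parallel expansions give the descriptors v1 and v2 (length C each).
    v1, v2 = matvec(w_a, z), matvec(w_b, z)
    fused, weights = [], []
    for c in range(n_ch):
        # Select: softmax across the two branches gives s1 and s2 per channel.
        m = max(v1[c], v2[c])
        e1, e2 = math.exp(v1[c] - m), math.exp(v2[c] - m)
        s1, s2 = e1 / (e1 + e2), e2 / (e1 + e2)
        weights.append((s1, s2))
        fused.append([[s1 * a + s2 * b for a, b in zip(row_a, row_b)]
                      for row_a, row_b in zip(content[c], texture[c])])
    return fused, weights
```

Because s1 and s2 come from a softmax, they sum to 1 for every channel, so the output is a convex combination of the content and texture maps rather than a fixed-weight sum.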
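1.2.2 Foreground background balance loss function
In object recognition algorithms, the number and ratio of positive and negative samples strongly affect accuracy, yet both anchor-based and anchor-free frameworks follow the dense-prediction paradigm and produce large numbers of background samples during training[36]. A common global loss lets the background be over-represented, so small targets that occupy few pixels are under-represented, which causes an imbalance between positive and negative samples (Fig. 5).
FB Loss consists of a global reconstruction loss and a local positive-sample loss; by enlarging the gradient contribution of positive samples during training, it balances the feature representation of foreground and background.
Fig. 5 Imbalance between positive and negative samples
The global reconstruction loss guides the reconstruction of overall information in the high-resolution feature map generated by the SKFTT module; the L1 norm is used as the loss function so that features produced by the super-resolution module stay similar to the background features.
This loss compares the reconstructed feature map with a target feature map; this study uses the 2x-size feature map from the backbone network as the target to supervise generation of the high-resolution feature map.
The local positive-sample loss strengthens attention to the high-frequency information of positive samples.
In this loss, P denotes the ground-truth boxes; the other quantities are the number of pixels in the target region and the coordinates of a pixel on the feature map.
FB Loss is the weighted sum of the global reconstruction loss and the local positive-sample loss.
The weight balance factor for positive samples is set empirically to 1 in this study.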
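The structure of FB Loss can be sketched for a single-channel feature map. The equations themselves are not reproduced in this text, so the form used here is an assumption: the global term is the L1 norm of the difference between target and reconstructed maps, the positive term is the same absolute difference averaged over pixels inside ground-truth boxes, and the two are added with a weight `lam` of 1. The function name and the binary mask input are hypothetical.

```python
def fb_loss(recon, target, pos_mask, lam=1.0):
    """Return (global_l1, positive_l1, weighted_sum) for 2-D nested-list maps.

    pos_mask marks pixels inside ground-truth boxes with a truthy value.
    Assumed form: global L1 norm plus lam times the mean absolute
    difference over positive pixels.
    """
    global_l1 = 0.0
    pos_sum = 0.0
    n_pos = 0
    for row_r, row_t, row_m in zip(recon, target, pos_mask):
        for r, t, m in zip(row_r, row_t, row_m):
            diff = abs(t - r)
            global_l1 += diff
            if m:
                pos_sum += diff
                n_pos += 1
    positive_l1 = pos_sum / n_pos if n_pos else 0.0
    return global_l1, positive_l1, global_l1 + lam * positive_l1
```

Errors on positive pixels are counted in both terms, which is what raises their share of the gradient relative to the far more numerous background pixels.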
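The total loss of the improved YOLOv5 comprises the confidence loss, the bounding-box loss, and the foreground background balance loss.
The confidence loss is computed with the binary cross entropy loss (BCE Loss) and the bounding-box loss with the generalized intersection over union loss (GIoU Loss); the balance coefficients of the three terms are set to 1.00, 0.05, and 0.01, respectively.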
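The loss equations did not survive in this text. One reconstruction that is consistent with the surrounding definitions is given below as a reading aid only. All symbol names are assumptions: F is the reconstructed feature map, F^t the target feature map, P the set of pixels inside ground-truth boxes, N the number of such pixels, (x, y) a pixel coordinate, and λ, λ1, λ2, λ3 the balance factors. The assignment of 1.00, 0.05, and 0.01 to the confidence, bounding-box, and FB terms follows the order in which the text lists them.

```latex
\begin{align}
L_{glob} &= \lVert F^{t} - F \rVert_{1} \\
L_{pos}  &= \frac{1}{N} \sum_{(x,y) \in P} \lVert F^{t}_{x,y} - F_{x,y} \rVert_{1} \\
L_{FB}   &= L_{glob} + \lambda L_{pos}, \qquad \lambda = 1 \\
L        &= \lambda_{1} L_{conf} + \lambda_{2} L_{box} + \lambda_{3} L_{FB},
            \qquad (\lambda_{1}, \lambda_{2}, \lambda_{3}) = (1.00,\ 0.05,\ 0.01)
\end{align}
```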
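1.3 Experimental environment and evaluation metrics
The experiments ran on a computer with an Intel® Xeon(R) CPU E5-2678 v3 @ 2.50GHz and an NVIDIA GeForce RTX 3090 GPU under Ubuntu 18.04, with PyTorch 1.9.0 and CUDA 11.1. Training consisted of pre-training on the public COCO dataset[37] and fine-tuning on the dead pine tree dataset. The fine-tuning parameters were as follows: batch size 8, 200 training epochs, initial learning rate 0.01, SGD optimizer, momentum 0.937, and weight decay 0.000 5.
The COCO evaluation standard[38] was adopted, and mAP50, mAP75, mAP50~95, APsmall, APmid, and APlarge were used together to evaluate the dead pine tree recognition model; all AP values were computed with 101-point interpolation. mAP50 and mAP75 are the mean average precision at intersection over union (IoU) thresholds of 0.5 and 0.75; precision at a higher threshold generally reflects closer agreement with the ground-truth annotation. mAP50~95 is the mAP averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05. APsmall, APmid, and APlarge are the average precision for small, medium, and large targets. Following the object definitions of the widely used COCO dataset, small targets are smaller than 32×32 pixels, medium targets are between 32×32 and 96×96 pixels, and large targets are larger than 96×96 pixels.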
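The quantities behind these metrics can be made concrete with a short sketch: the IoU of two boxes, the ten IoU thresholds used for mAP50~95, and the size bucket of a target. Boxes are assumed to be (x1, y1, x2, y2) tuples in pixels, and the helper names are hypothetical; the full AP computation with 101-point interpolation is not shown.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def size_bucket(width, height):
    """Classify a target by pixel area using the 32x32 and 96x96 limits."""
    area = width * height
    if area < 32 * 32:
        return "small"
    if area <= 96 * 96:
        return "medium"
    return "large"


# IoU thresholds 0.50, 0.55, ..., 0.95 over which mAP50~95 is averaged.
IOU_THRESHOLDS = [round(0.5 + 0.05 * k, 2) for k in range(10)]
```

A prediction counts as a true positive at a given threshold only if its IoU with a ground-truth box reaches that threshold, so the same detection can be correct for mAP50 and incorrect for mAP75.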
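2 Results and analysis
2.1 Ablation experiments for dead pine tree recognition
To verify the contribution of each modification, ablation experiments were run on the test set; the results are shown in Table 1. Adding the SKFTT module raised mAP50 by 3.0 percentage points and APsmall by 11.9 percentage points. Adding FB Loss on top of that further improved small-target accuracy: compared with the second row, mAP50 rose by 0.2 percentage points and APsmall by 3.9 percentage points. Overall, the proposed algorithm reaches an mAP50 of 92.7%; compared with YOLOv5, mAP50 is 3.2 percentage points higher, mAP50~95 is 8.3 percentage points higher, and APsmall, APmid, and APlarge are 15.8, 8.0, and 5.7 percentage points higher, respectively. Adding the SKFTT module increases the parameter count and computation somewhat and lowers FPS, but the speed still meets the needs of pine wilt disease control.
Table 1 Comparison of ablation results for different YOLOv5 improvements
Note: SKFTT is the selective kernel feature texture transfer module; FB Loss is the foreground background balance loss function; mAP50 and mAP75 are the mAP of the model at IoU thresholds of 0.5 and 0.75; mAP50~95 is the mAP averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05; APsmall, APmid, and APlarge are the AP for small, medium, and large targets, respectively; “√” means the module is added and “-” means the operation is not performed. The same below.
The UAV images used in this study cover real forestry production scenes with different illumination, viewing angles, and scales, and include many small targets as well as samples whose boundaries and textures are blurred by occlusion from branches and leaves, overexposure, or backlighting. Fig. 6 compares recognition results before and after the improvement. After SKFTT and FB Loss are added, the confidence scores of predicted boxes increase, fewer small targets are missed, boundary blur caused by light occlusion is alleviated, and recognition of blurred targets under overexposure and backlighting also improves. Irregularly shaped samples are still sometimes missed, because such samples are scarce and the resulting undertraining affects recognition. In addition, under backlighting the texture of dead pine trees can be too blurred and the image can carry too little information, so a small number of targets are also missed.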
Fig. 6 Comparison of recognition results before and after the improvement. Scenes: Normal, Occlusion, Overexposure, Backlight. Panels: a. Manual annotation; b. YOLOv5; c. YOLOv5+SKFTT; d. YOLOv5+SKFTT+FB Loss
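2.2 Comparison of object recognition algorithms
To verify the advantage of the improved algorithm, it was compared on the test set with other advanced object recognition algorithms: the two-stage Faster R-CNN[39], the one-stage anchor-based YOLOv4[40], the one-stage anchor-free YOLOX[41] and MT-YOLOv6[42], the small-target recognition algorithm QueryDet[43], and the dead pine tree recognition algorithm DDYOLOv5[19]. The results are shown in Table 2.
Table 2 shows that the improved YOLOv5 algorithm proposed in this study achieves the best recognition accuracy on the dead pine tree test set. Compared with Faster R-CNN, YOLOv4, YOLOX, MT-YOLOv6, QueryDet, and DDYOLOv5, its mAP50 is higher by 16.7, 15.3, 2.5, 2.8, 12.3, and 1.2 percentage points, and its APsmall is higher by 46.0, 21.1, 0.5, 1.3, 5.2, and 2.7 percentage points, respectively. In summary, the improved algorithm learns image information well and effectively raises recognition accuracy for small targets.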
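2.3 Visualization of detection-layer feature maps for different algorithms
As the feature network deepens, features become increasingly abstract, and feature visualization helps in understanding how a deep network recognizes dead pine trees. In Fig. 7, the improved algorithm with SKFTT is compared visually with the super-resolution algorithms BSRGAN[44], realESRGAN[45], Swin IR[46], and FTT[34].
The 2x feature map resembles the targets in the original image fairly closely, which indicates that using it to supervise the SKFTT module in reconstructing high-resolution feature maps is feasible. Compared with first reconstructing a high-resolution image with a super-resolution algorithm and then extracting feature maps, the feature layer reconstructed by the SKFTT module contains rich regional detail. Compared with FTT and the other super-resolution algorithms, the SKFTT module yields clearer target boundaries and richer texture, which helps the network extract the features of small and texture-blurred targets.
Table 2 Recognition results of different algorithms on the test set
Note: the feature maps of the original model and of the super-resolution algorithms BSRGAN, realESRGAN, and Swin IR all come from the YOLOv5 model extended with one small-target detection layer; the FTT feature map is obtained by substituting FTT for SKFTT in the improved model.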
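3 Conclusions
In production practice, rugged mountain terrain and the resulting high UAV flight altitude make some dead pine trees appear small and blurred in texture in the images, so they are hard to recognize. This study therefore proposes a YOLOv5 algorithm combined with super-resolution reconstruction for dead pine tree recognition.
The algorithm adds the super-resolution module SKFTT to the feature fusion network and uses the foreground background balance loss function FB Loss to mine key information at different scales and to optimize the reconstructed features. The improved algorithm reaches an mAP50 (mean average precision at an IoU threshold of 0.5) of 92.7% for dead pine trees and an average precision of 53.2% for small targets; compared with the original algorithm, mAP50 is 3.2 percentage points higher and APsmall is 15.8 percentage points higher, at 37 frames/s. Compared with common object recognition algorithms, its accuracy exceeds that of Faster R-CNN, YOLOv4, YOLOX, MT-YOLOv6, QueryDet, and DDYOLOv5.
Visual analysis of the feature maps reconstructed by the SKFTT module and by other super-resolution algorithms shows that, after reconstruction by the SKFTT module, high-resolution feature information is fully used and fine target texture is supplemented, which benefits feature extraction for small and texture-blurred targets.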
[1] YE Jianren. Epidemic status of pine wilt disease in China and its prevention and control techniques and counter measures[J]. Scientia Silvae Sinicae, 2019, 55(9): 1-10. (in Chinese with English abstract)
[2] ZHANG Xiaodong, YANG Haobo, CAI Peihua, et al. Research progress on remote sensing monitoring of pine wilt disease[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(18): 184-194. (in Chinese with English abstract)
[3] TAO Huan, LI Cunjun, XIE Chunchun, et al. Recognition of red-attack pine trees from UAV imagery based on the HSV threshold method[J]. Journal of Nanjing Forestry University (Natural Sciences Edition), 2019, 43(3): 99-106. (in Chinese with English abstract)
[4] LIU Xialing, CHENG Duoxiang, LI Tao, et al. Preliminary study on automatic monitoring trees infected by pine wood nematode with high resolution images from unmanned aerial vehicle[J]. Forest Pest and Disease, 2018, 37(5): 16-21. (in Chinese with English abstract)
[5] LIU Jincang, WANG Chengbo, CHANG Yuanfei. Monitoring method of Bursaphelenchus xylophilus based on multi-feature CRF by UAV image[J]. Bulletin of Surveying and Mapping, 2019(7): 78-82. (in Chinese with English abstract)
[6] WU X, SAHOO D, HOI S C H. Recent advances in deep learning for object detection[J]. Neurocomputing, 2020, 396: 39-64.
[7] LI Jiaqi, WU Kaihua, ZHANG Yao, et al. Establishing a monitoring model for pine wood nematode damage based on UAV spectral remote sensing and AI technology[J]. Electronic Technology & Software Engineering, 2021(8): 91-94. (in Chinese)
[8] XU Xinluo, TAO Huan, LI Cunjun, et al. Detection and location of pine wilt disease induced dead pine trees based on Faster R-CNN[J]. Transactions of the Chinese Society for Agricultural Machinery, 2020, 51(7): 228-236. (in Chinese with English abstract)
[9] DENG X, TONG Z, LAN Y, et al. Detection and location of dead trees with pine wilt disease based on deep learning and UAV remote sensing[J]. AgriEngineering, 2020, 2(2): 294-307.
[10] MAO Rui, ZHANG Yuchen, WANG Zexi, et al. Recognizing stripe rust and yellow dwarf of wheat using improved Faster-RCNN[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(17): 176-185. (in Chinese with English abstract)
[11] WU B, LIANG A, ZHANG H, et al. Application of conventional UAV-based high-throughput object detection to the early diagnosis of pine wilt disease by deep learning[J]. Forest Ecology and Management, 2021, 486: 118986.
[12] LIM W, CHOI K, CHO W, et al. Efficient dead pine tree detecting method in the Forest damaged by pine wood nematode (Bursaphelenchus xylophilus) through utilizing unmanned aerial vehicles and deep learning-based object detection techniques[J]. Forest Science and Technology, 2022, 18(1): 36-43.
[13] CHEN Fengjun, ZHU Xueyan, ZHOU Wenjing, et al. Spruce counting method based on improved YOLOv3 model in UAV images[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(22): 22-30. (in Chinese with English abstract)
[14] SUN Yu, ZHOU Yan, YUAN Mingshuai, et al. UAV real-time monitoring for forest pest based on deep learning[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2018, 34(21): 74-81. (in Chinese with English abstract)
[15] HUANG Liming, WANG Yixiang, XU Qi, et al. Recognition of abnormally discolored trees caused by pine wilt disease using YOLO algorithm and UAV images[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(14): 197-203. (in Chinese with English abstract)
[16] SUN Z, IBRAYIM M, HAMDULLA A. Detection of pine wilt nematode from drone images using UAV[J]. Sensors, 2022, 22(13): 4704.
[17] LI F, LIU Z, SHEN W, et al. A remote sensing and airborne edge-computing based detection system for pine wilt disease[J]. IEEE Access, 2021, 9: 66346-66360.
[18] LI X, LIU Y, HUANG P, et al. Integrating multi-scale remote-sensing data to monitor severe forest infestation in response to pine wilt disease[J]. Remote Sensing, 2022, 14(20): 5164.
[19] HU G, YAO P, WAN M, et al. Detection and classification of diseased pine trees with different levels of severity from UAV remote sensing images[J]. Ecological Informatics, 2022, 72: 101844.
[20] YOU J, ZHANG R, LEE J. A deep learning-based generalized system for detecting pine wilt disease using RGB-based UAV images[J]. Remote Sensing, 2021, 14(1): 150.
[21] HA V K, REN J C, XU X Y, et al. Deep learning based single image super-resolution: A survey[J]. International Journal of Automation and Computing, 2019, 16(4): 413-426.
[22] DONG C, LOY C C, HE K, et al. Learning a deep convolutional network for image super-resolution[C]// Proceedings of the European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 184-199.
[23] HAN Qiaoling, ZHOU Xibo, SONG Runze, et al. Super-resolution reconstruction of soil CT images using sequence information[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(17): 90-96. (in Chinese with English abstract)
[24] CHEN H, WANG Y, GUO T, et al. Pre-trained image processing transformer[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA: IEEE, 2021: 12299-12310.
[25] WANG B, YAN B, JEON G, et al. Lightweight dual mutual-feedback network for artificial intelligence in medical image super-resolution[J]. Applied Sciences, 2022, 12(24): 12794.
[26] ALWAKID G, GOUDA W, HUMAYUN M, et al. Melanoma detection using deep learning-based classifications[J]. Healthcare, 2022, 10(12): 2481.
[27] CHE Yingpu, WANG Qing, LI Shilin, et al. Monitoring of maize phenotypic traits using super-resolution reconstruction and multimodal data fusion[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(20): 169-178. (in Chinese with English abstract)
[28] SHERMEYER J, VAN ETTEN A. The effects of super-resolution on object detection performance in satellite imagery[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Long Beach, CA, USA: IEEE, 2019: 1432-1441.
[29] ZHAO Junyan, JIANG Ailian, QIANG Yan. Robust face detection using YOLOv3 fusion super resolution reconstruction[J]. Computer Engineering and Applications, 2022, 58(19): 250-256. (in Chinese with English abstract)
[30] FENG Zhiqiang, XIE Zhijun, BAO Zhengwei, et al. Real-time dense small target detection algorithm for unmanned aerial vehicle based on improved YOLOv5[J/OL]. Acta Aeronautica et Astronautica Sinica, (2022-05-11)[2022-08-13]. http://kns.cnki.net/kcms/detail/11.1929.V.20220509.2316.010.html. (in Chinese with English abstract)
[31] BOSCH M, GIFFORD C M, RODRIGUEZ P A. Super-resolution for overhead imagery using densenets and adversarial learning[C]//Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). Lake Tahoe, NV, USA: IEEE, 2018: 1414-1422.
[32] BAI Y, ZHANG Y, DING M, et al. Sod-mtgan: Small object detection via multi-task generative adversarial network[C]// Proceedings of the European Conference on Computer Vision (ECCV). Munich, Germany: Springer, 2018: 206-221.
[33] NOH J, BAE W, LEE W, et al. Better to follow, follow to be better: Towards precise supervision of feature super-resolution for small object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE, 2019: 9725-9734.
[34] DENG C, WANG M, LIU L, et al. Extended feature pyramid network for small object detection[J]. IEEE Transactions on Multimedia, 2021, 24: 1968-1979.
[35] GLENN Jocher, ALEX Stoken, JIRKA Borovec, et al. ultralytics/yolov5: v6.0-YOLOv5n ‘Nano’ models, Roboflow integration, TensorFlow export, OpenCV DNN support[EB/OL]. (2021-10-12)[2022-10-19]. https://github.com/ultralytics/yolov5.
[36] CHEN J, WU Q, LIU D, et al. Foreground-background imbalance problem in deep object detectors: A review[C]//Proceedings of the 2020 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR). Shenzhen, China: IEEE, 2020: 285-290.
[37] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft coco: Common objects in context[C]//Proceedings of the European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014: 740-755.
[38] CHEN X, FANG H, LIN T Y, et al. Microsoft coco captions: Data collection and evaluation server[EB/OL]. (2015-04-01)[2022-10-19]. https://arxiv.org/abs/1504.00325.
[39] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(6): 1137-1149.
[40] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. Yolov4: Optimal speed and accuracy of object detection[EB/OL]. (2020-04-23)[2022-11-06]. https://arxiv.org/abs/2004.10934.
[41] GE Z, LIU S, WANG F, et al. Yolox: Exceeding yolo series in 2021[EB/OL]. (2021-08-06)[2022-11-06]. https://arxiv.org/abs/2107.08430.
[42] LI C, LI L, JIANG H, et al. YOLOv6: A single-stage object detection framework for industrial applications[EB/OL]. (2022-09-07)[2022-11-06]. https://arxiv.org/abs/2209.02976.
[43] YANG C, HUANG Z, WANG N. QueryDet: Cascaded sparse query for accelerating high-resolution small object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, LA, USA: IEEE, 2022: 13668-13677.
[44] ZHANG K, LIANG J, VAN GOOL L, et al. Designing a practical degradation model for deep blind image super-resolution[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, QC, Canada: IEEE, 2021: 4791-4800.
[45] WANG X, XIE L, DONG C, et al. Real-esrgan: Training real-world blind super-resolution with pure synthetic data[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, QC, Canada: IEEE, 2021: 1905-1914.
[46] LIANG J, CAO J, SUN G, et al. Swinir: Image restoration using swin transformer[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, QC, Canada: IEEE, 2021: 1833-1844.
Recognition of dead pine trees using YOLOv5 by super-resolution reconstruction
WANG Wenjin1, YOU Ziyi2, SHAO Lijiang1, LI Xiaolin1, WU Songqing2,3, ZHANG Zhuhe4, HUANG Shiguo1,3※, ZHANG Feiping2,3
(1. College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002; 2. College of Forestry, Fujian Agriculture and Forestry University, Fuzhou 350002; 3. Key Laboratory of Integrated Pest Management in Ecological Forests, Fujian Province University, Fuzhou 350002; 4. Fuzhou Forest Pest Control and Quarantine Station, Fuzhou 350002)
Pine wilt disease poses a significant threat to forest ecosystems because it is highly contagious and destructive. The critical step in its prevention and control is eliminating the sources of disease, which requires accurate recognition and removal of dead pine trees. In practice, however, many targets in the captured images are small or blurred, for example overexposed, backlit, or occluded by foliage, because UAVs have to fly high over hilly and mountainous terrain. In this study, a novel You Only Look Once v5 (YOLOv5) algorithm was proposed for the recognition of dead pine trees, with super-resolution reconstruction performed at the feature level to overcome the challenge of recognizing such targets. The YOLOv5 structure was redesigned in two respects. First, the Selective Kernel Feature Texture Transfer (SKFTT) module was adopted to create high-resolution detection feature maps with detailed textures, which improved detection accuracy for small and blurred targets. Specifically, feature maps rich in texture were selected from the backbone network, whereas feature maps rich in semantics were selected from the feature fusion network. These feature maps were then sent to the texture extractor and the content extractor, and a selective feature fusion module fused the critical information at different scales according to learned weights. Second, the Foreground Background Loss function (FB Loss) was introduced to supervise the reconstruction of high-resolution feature maps; it attenuates useless features, enhances the gradient contribution of positive samples, and balances the distribution of positive and negative samples. To validate the improved model, a dataset was collected from approximately 15 400 hectares of forest land in Minhou County of Fuzhou City and Xianyou County of Putian City, Fujian Province, China.
The UAV images were cropped and screened to obtain 29 250 labelled samples for the experiments. A series of ablation tests and visualizations were conducted on the test set to verify effectiveness. Experimental results showed that the mean Average Precision (mAP50) of the improved model was 92.7%, mAP50~95 was 62.1%, and APsmall was 53.2%. Compared with the baseline model, the improved model increased mAP50, mAP50~95, and APsmall by 3.2, 8.3, and 15.8 percentage points, respectively. The mAP50 of the improved model was 16.7, 15.3, 2.5, 2.8, 12.3, and 1.2 percentage points higher than that of the Faster R-CNN, YOLOv4, YOLOX, MT-YOLOv6, QueryDet, and DDYOLOv5 networks, respectively. In addition, the improved model reached 37 frames per second (FPS), which met the detection requirements for dead pine trees. Visualization results showed that the improved model recognized occluded, overexposed, and backlit targets better than the baseline. The feature maps of the small target detection layer were also visualized for different super-resolution algorithms; the comparison revealed richer texture and clearer boundary shapes after reconstruction by the SKFTT module. In conclusion, the improved dead pine tree detection algorithm achieves high accuracy and effectively alleviates the difficulty of detecting small and blurred targets. The improved detection algorithm is therefore conducive to the efficient removal of diseased trees and to comprehensive prevention and control, and the improved model can contribute to accelerating the “digital forest prevention” process in precision agriculture.
UAV; image recognition; dead pine trees; small target detection; super-resolution reconstruction; feature fusion
10.11975/j.issn.1002-6819.202211141
TP391.4
A
1002-6819(2023)-05-0137-09
WANG Wenjin, YOU Ziyi, SHAO Lijiang, et al. Recognition of dead pine trees using YOLOv5 by super-resolution reconstruction[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2023, 39(5): 137-145. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.202211141 http://www.tcsae.org
2022-11-15
2023-02-13
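Fujian Provincial Forestry Science and Technology Project (Minlinwen[2021]35); Major Emergency Science and Technology Project of the National Forestry and Grassland Administration (ZD202001); Science and Technology Innovation Special Fund of Fujian Agriculture and Forestry University (KFb22097XA)
WANG Wenjin, research interest: computer vision. Email: 1201193020@fafu.edu.cn
HUANG Shiguo, Ph.D., professor, research interests: computer applications in agriculture and forestry, computer vision. Email: sghuang@fafu.edu.cn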