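Zhang Yaoxin1,2, Zhu Rongguang1,2※, Meng Lingfeng1, Ma Rong1, Wang Shichang1, Bai Zongxiu1, Cui Xiaomin1
(1. College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832003; 2. Key Laboratory of Northwest Agricultural Equipment, Ministry of Agriculture and Rural Affairs, Shihezi 832003)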
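To address the weak generalization, limited accuracy, and long processing time of traditional image classification models, this study built an improved ResNet18 network model for identifying different mutton parts and developed a smartphone application that identifies them quickly. First, data augmentation was used to expand the original smartphone images collected from mutton loin, fore shank, and hind shank. Second, the Additive Angular Margin Loss (ArcFace) was introduced into the ResNet18 network structure as a feature optimization layer that takes part in training; by optimizing the class features, it strengthens intra-class compactness and inter-class discrepancy among the mutton parts. At the same time, the standard convolutions in the residual structure of ResNet18 were replaced with depthwise separable convolutions to reduce the number of network parameters and speed up the network. Third, the effects of different optimizers, learning rates, and weight decay coefficients on convergence speed and accuracy were examined, and the model parameters were determined. Finally, the network model was ported to an Android phone to detect different mutton parts on a mobile device. The results showed that the improved ResNet18 network model reached an accuracy of 97.92% on the test set, 5.92 percentage points higher than the ResNet18 network model; after deployment to the mobile device, the detection time per image was about 0.3 s. By combining the improved ResNet18 network model with smartphone images, this study achieved fast and accurate mobile classification of different mutton parts, providing technical support for intelligent mutton inspection and quality-based pricing in the mutton market.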
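0 Introduction
At present, the main techniques for meat product inspection are near-infrared spectroscopy, hyperspectral imaging, and sensor-based detection (electronic nose, electronic tongue)[7]. Sanz et al.[8] combined hyperspectral imaging with machine learning algorithms to classify the longissimus dorsi, psoas major, and semimembranosus muscles of lamb, with an accuracy of 96.67%. Kamruzzaman et al.[9] used hyperspectral imaging with principal component analysis to classify the semitendinosus, latissimus dorsi, and psoas major muscles of Charollais sheep, with an accuracy of 100%. However, these methods are hard to deploy widely because of their high cost and complex operation. As smartphones have become faster, studies of mobile meat product inspection have increased[10-12]. Zhao Xinlong et al.[10] developed a smartphone application for detecting beef marbling, with a detection accuracy of 95.56% and a detection time below 0.5 s per image. Studies of application software for classifying different mutton parts remain scarce, however. Meng Lingfeng et al.[11] used a back-propagation neural network to classify different mutton parts from smartphone images and developed a corresponding smartphone application, but its accuracy was 90.94%, so the precision of the method is limited.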
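1 Materials and methods
1.1 Image acquisition and preprocessing
1.1.1 Mutton sample image acquisition
The test samples were purchased from the Central Farmers' Market in Shihezi and were taken from 6 Small-tailed Han sheep (6 to 8 months old). After about 30 h of post-slaughter aging, they were sent to the Agricultural and Livestock Products Laboratory of Shihezi University for sample preparation. The prepared loin, fore shank, and hind shank samples measured about 40 mm × 30 mm × 10 mm (length × width × height); they were vacuum-packed and stored in a refrigerator at 4 ℃. To keep natural ambient light from affecting the experiment and to make the acquisition environment more stable, the whole experiment was carried out in an enclosed environment with supplementary lighting. Before image acquisition, the samples were taken out of the refrigerator and allowed to return to room temperature, and images were then captured with a smartphone (Huawei P10, Huawei Technologies Co., Ltd., China). Images were captured twice a day, at 10:00 and 22:00, for 12 consecutive days. During shooting, the phone camera was positioned 12 cm directly above the mutton sample. After abnormal samples that spoiled early were removed, 14 samples each of loin, fore shank, and hind shank were obtained, giving 1 008 images in total (42 samples × 24 acquisitions). The images were in .jpg format with a resolution of 2 976 × 3 968 pixels.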
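1.1.2 Image data preprocessing
To improve the adaptability and generalization of the network model, the mutton images were augmented by random rotation, horizontal and vertical flipping, adjustment of brightness, saturation, and contrast, and the addition of Gaussian blur and salt-and-pepper noise[30]. The augmented images numbered 9 times the originals, so that, together with the 1 008 originals, the expanded dataset contained 10 080 images; examples of mutton images under the different augmentation methods are shown in Fig. 1. The dataset covers 3 classes of mutton parts: fore shank, hind shank, and loin. To keep the classes balanced, 2 000 images were randomly selected from each class, giving 6 000 smartphone images in total, which were divided into a training set (4 800 images) and a test set (1 200 images) at a ratio of 4∶1.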
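The dataset bookkeeping described above can be checked with a short sketch. It assumes that each original image contributes itself plus nine augmented copies (which reproduces the stated total of 10 080), and that the 4∶1 split is applied within each class; the paper does not state how the split was stratified. The file names and the seed are hypothetical placeholders.

```python
import random

CLASSES = ["loin", "fore_shank", "hind_shank"]
SAMPLES_PER_CLASS = 14          # samples kept per mutton part
ACQUISITIONS = 2 * 12           # two captures a day for 12 days
N_AUGMENTED_COPIES = 9          # assumption: nine augmented copies per original
SELECTED_PER_CLASS = 2000       # images drawn per class for modelling
TRAIN_FRACTION = 0.8            # 4:1 train/test ratio

originals_per_class = SAMPLES_PER_CLASS * ACQUISITIONS
pool_per_class = originals_per_class * (1 + N_AUGMENTED_COPIES)
total_originals = originals_per_class * len(CLASSES)
total_pool = pool_per_class * len(CLASSES)


def split_dataset(seed=0):
    """Draw a balanced subset and split it per class (assumed stratified)."""
    rng = random.Random(seed)
    train, test = [], []
    n_train = int(SELECTED_PER_CLASS * TRAIN_FRACTION)
    for label in CLASSES:
        # Hypothetical file names standing in for the real images.
        pool = [f"{label}_{i:04d}.jpg" for i in range(pool_per_class)]
        chosen = rng.sample(pool, SELECTED_PER_CLASS)
        train += [(name, label) for name in chosen[:n_train]]
        test += [(name, label) for name in chosen[n_train:]]
    return train, test


train_set, test_set = split_dataset()
print(total_originals, total_pool, len(train_set), len(test_set))
```

Under these assumptions each class has a pool of 3 360 images, which is enough to draw 2 000 without replacement.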
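1.2 Construction of the mutton part classification model based on the improved ResNet18 network
1.2.1 ResNet18 network
1.2.2 Additive Angular Margin Loss (ArcFace)
1.2.3 Depthwise separable convolution
1.2.4 Construction of the improved ResNet18 network model
1.2.5 Experimental environment
The mutton part classification model was trained in the following environment. Hardware: an Intel® Core™ i7-6700K CPU @ 3.40 GHz, 40 GB of RAM, and an NVIDIA GeForce RTX 2080 Ti graphics card (11 GB of video memory), among others. Software: Windows 10 (64-bit), Python 3.6.5, the deep learning framework PyTorch 1.1.0, the general-purpose computing architecture CUDA 10.0, and the GPU acceleration library cuDNN 7.4.1. The smartphone APP was developed and tested in the following environment. Hardware: a Huawei phone with 64 GB of memory (P10, Huawei Technologies Co., Ltd., China). Software: the Android 8.0 operating system and the Android Studio development environment.
1.2.6 Evaluation metrics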
Fig.2 Structure diagram of the improved ResNet18 network model
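2 Results and analysis
2.1 Effect of the Additive Angular Margin Loss (ArcFace) on feature distribution
Table 1 Comparison of the feature distributions of the ResNet18 and ResNet18_ArcFace networks
2.2 Effect of depthwise separable convolution on the number of network model parameters
2.3 Effect of model parameters on the improved ResNet18 network model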
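To study how the training parameters affect the improved ResNet18 network model, two optimizers were used, stochastic gradient descent (SGD) and adaptive moment estimation (Adam), with learning rates of 0.01 and 0.001 and weight decay coefficients of 0 and 0.000 5, and the effects of the different parameters on model accuracy were compared. The variation of test-set accuracy with these parameters is shown in Fig. 4. As Fig. 4 shows, Adam converged faster than SGD, but the accuracy of the network model fluctuated more. The detailed effects of the model parameters on the improved ResNet18 network model are listed in Table 2: with the SGD optimizer, a learning rate of 0.01, and a weight decay coefficient of 0.000 5, the test-set accuracy was 97.92% and the accuracy curve was more stable.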
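The parameter comparison can be pictured as a small grid over optimizer, learning rate, and weight decay. The sketch below only enumerates the candidate settings and marks the one reported as best; it assumes a full factorial grid, which the paper does not confirm (the note to Table 2 suggests weight decay was not meaningfully combined with Adam). No accuracies other than the reported 97.92% are implied.

```python
from itertools import product

OPTIMIZERS = ("SGD", "Adam")
LEARNING_RATES = (0.01, 0.001)
WEIGHT_DECAYS = (0.0, 0.0005)

# Setting reported in the text as giving 97.92% test accuracy.
REPORTED_BEST = {"optimizer": "SGD", "lr": 0.01, "weight_decay": 0.0005}


def build_grid():
    """Enumerate every optimizer / learning rate / weight decay combination."""
    return [
        {"optimizer": opt, "lr": lr, "weight_decay": wd}
        for opt, lr, wd in product(OPTIMIZERS, LEARNING_RATES, WEIGHT_DECAYS)
    ]


grid = build_grid()
for config in grid:
    # In PyTorch each entry would map onto torch.optim.SGD or torch.optim.Adam,
    # both of which accept `lr` and `weight_decay` keyword arguments.
    marker = "  <- reported best" if config == REPORTED_BEST else ""
    print(config, marker)
```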
Table 2 Comparison of the accuracy of the improved ResNet18 network model under different parameters
Note: Adding a weight decay coefficient is not valid for the Adam optimization algorithm.
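2.4 Comparison of the improved ResNet18 network model with other network models
All network models in this study were trained by transfer learning: every layer except the fully connected layer was frozen, and only the last layer was trained. The SGD optimizer was used for training, with a learning rate of 0.01 and a weight decay coefficient of 0.000 5. The batch size was 32 and the maximum number of epochs was 100.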
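Table 3 Comparison of the classification results of the ResNet18, improved ResNet18, and MobileNet network models
The confusion matrix of the improved ResNet18 network model is shown in Table 4. As Table 4 shows, when the improved ResNet18 network model was tested on the 1 200 test images of different mutton parts, only 25 images were misclassified, indicating good classification performance. Further analysis of the misclassified images showed that the color and texture features of the different mutton parts are an important basis for classification: misclassification tends to occur when the sample is hard to distinguish from the background in color or when the sample texture is unclear (blurred or noisy). Most misclassified samples were images with high brightness, low saturation, added salt-and-pepper noise, or blur, whose relatively complex backgrounds interfered with accurate identification.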
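The overall accuracy follows directly from the error count, and it can be cross-checked against the per-class accuracies given in the English abstract (97.00%, 98.00%, and 98.75% for loin, hind shank, and fore shank). The sketch below assumes the test set holds 400 images per class, which is what an even split of the 1 200 test images would give; the paper does not state the per-class test counts explicitly.

```python
TEST_IMAGES = 1200
MISCLASSIFIED = 25

# Overall accuracy from the reported error count.
overall_accuracy = (TEST_IMAGES - MISCLASSIFIED) / TEST_IMAGES

# Per-class accuracies as reported in the English abstract.
PER_CLASS_ACCURACY = {"loin": 0.9700, "hind_shank": 0.9800, "fore_shank": 0.9875}
IMAGES_PER_CLASS = 400  # assumption: test set evenly split across 3 classes

# Correctly classified images implied for each class.
correct_per_class = {
    label: round(acc * IMAGES_PER_CLASS)
    for label, acc in PER_CLASS_ACCURACY.items()
}
implied_errors = sum(IMAGES_PER_CLASS - n for n in correct_per_class.values())

print(f"overall accuracy: {overall_accuracy:.2%}")
print("correct per class:", correct_per_class)
print("implied errors:", implied_errors)
```

Under the 400-per-class assumption the per-class figures imply 12, 8, and 5 errors, which sum to the 25 misclassified images reported in the text.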
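Table 4 Confusion matrix of the improved ResNet18 network model
2.5 Implementation of the smartphone APP for classifying different mutton parts
To enable fast and accurate classification of different mutton parts on a mobile device, the trained improved ResNet18 network model was deployed to an Android device with the PyTorch Mobile framework. First, the trained model was converted into a TorchScript model and saved in .pt format. Then, the mutton part classification APP was developed in Android Studio; it consists of a front-end interface and back-end processing. The front-end interface is laid out mainly in .xml files, with text and button components added to display the mutton image and the detection result. The back-end processing is written in Java and comprises image acquisition, image processing, and model inference. When the APP identifies a mutton part, the image acquisition function first captures an image, the image processing function then resizes it to 224 × 224 pixels and stores it, and finally the .pt TorchScript model is called to classify the resized image. The APP was tested on the 1 200 test-set images, and the detection time per image was about 0.3 s.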
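The per-image detection time is an average over the test images. A minimal sketch of such a measurement is given below in Python rather than the APP's Java; `dummy_classify` is a hypothetical stand-in for the resize-and-inference step and does not reproduce the APP's code, so the printed value says nothing about the 0.3 s figure.

```python
import time


def mean_latency(classify, images):
    """Return the mean wall-clock time in seconds that classify() takes per image."""
    if not images:
        raise ValueError("images must not be empty")
    start = time.perf_counter()
    for image in images:
        classify(image)
    return (time.perf_counter() - start) / len(images)


def dummy_classify(image):
    # Hypothetical placeholder: a real pipeline would resize the image to
    # 224 x 224 pixels and run the TorchScript model here.
    return "loin"


images = [None] * 1200  # placeholders for the 1 200 test images
print(f"{mean_latency(dummy_classify, images):.6f} s per image")
```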
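3 Conclusions
3) After the improved ResNet18 model proposed in this study was converted into a TorchScript model and ported to a mobile device, the mutton detection application classified different mutton parts quickly and accurately, with a detection time of about 0.3 s per image.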
Classification of mutton location on the animal using improved ResNet18 network model and mobile application
Zhang Yaoxin1,2, Zhu Rongguang1,2※, Meng Lingfeng1, Ma Rong1, Wang Shichang1, Bai Zongxiu1, Cui Xiaomin1
Accurate and timely identification of meat parts is increasingly in demand in meat consumption. However, traditional image classification cannot clearly distinguish the similar color and texture characteristics of different mutton parts at different storage times, and it suffers from low generalization and long processing time. In this study, an improved ResNet18 network model was proposed to classify different mutton parts, and a corresponding mobile application was developed using the optimal model. Firstly, 1 008 mutton images of loin, hind shank, and fore shank at different storage times (0-12 d) were collected, and 9 types of data augmentation were used to expand the dataset. After that, 6 000 images were randomly selected from the augmented dataset for modeling; 80% of them were used as the training set and the remainder as the test set. Secondly, the Additive Angular Margin Loss (ArcFace) and depthwise separable convolution were introduced into the ResNet18 network to obtain the improved network. Thirdly, the improved ResNet18 network was trained with the augmented images of different mutton parts. Meanwhile, the effects of different parameters on the convergence speed and accuracy of the improved ResNet18 were evaluated. Two optimizers, stochastic gradient descent (SGD) and adaptive moment estimation (Adam), learning rates of 0.01 and 0.001, and weight decay coefficients of 0 and 0.000 5 were compared experimentally. The optimal classification model for different mutton parts was then determined. Finally, a mobile application was developed to deploy the TorchScript model converted from the improved ResNet18. The results showed that ArcFace greatly improved the distinguishability of different mutton parts, while the depthwise separable convolution significantly reduced the number of network parameters.
Furthermore, the improved ResNet18 network trained with the SGD optimizer showed higher accuracy and more stable performance in the test phase than the one trained with Adam. When the improved ResNet18 network was trained with the SGD optimizer, a learning rate of 0.01, and a weight decay coefficient of 0.000 5, only 25 of the 1 200 test images of different mutton parts were misclassified; the overall classification accuracy was 97.92%, and the classification accuracies for loin, hind shank, and fore shank were 97.00%, 98.00%, and 98.75%, respectively. Compared with the original ResNet18, the classification accuracy of the improved ResNet18 was 5.92 percentage points higher, and the classification accuracies for loin, hind shank, and fore shank were 5.75, 5.50, and 6.50 percentage points higher, respectively. Compared with the MobileNet model, the classification accuracy of the improved ResNet18 was 13.34 percentage points higher, and the classification accuracies for loin, hind shank, and fore shank were 13.50, 10.75, and 15.75 percentage points higher, respectively. Moreover, the application using the improved ResNet18 classified different mutton parts quickly and accurately, with an average detection time of about 0.3 s per image. These findings can provide technical and theoretical support for improving the intelligent inspection of meat products and for fair competition in the meat market.
Keywords: image processing; image recognition; models; mutton; ResNet18; mobile terminal; classification of mutton parts
Zhang Yaoxin, Zhu Rongguang, Meng Lingfeng, et al. Classification of mutton location on the animal using improved ResNet18 network model and mobile application[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(18): 331-338. (in Chinese with English abstract) doi:10.11975/j.issn.1002-6819.2021.18.038 http://www.tcsae.org