Soybean Leaf Morphology Classification Based on FPN-SSD and Knowledge Distillation

2021-01-15

Yu Xiao, Fu Li-ren, Dai Bai-sheng, and Wang Ye-cheng

1 College of Electrical Engineering and Information, Northeast Agricultural University, Harbin 150030, China

2 School of Computer Science and Technology, Shandong University of Technology, Zibo 255000, Shandong, China

3 Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Yangling 712100, Shaanxi, China

4 College of Engineering, Northeast Agricultural University, Harbin 150030, China

Abstract: Soybean leaf morphology is one of the most important morphological and biological characteristics of soybean. The germplasm gene differences of soybeans can lead to different phenotypic traits, among which soybean leaf morphology is an important parameter that directly reflects the difference in soybean germplasm. To realize the morphological classification of soybean leaves, a method was proposed based on deep learning to automatically detect soybean leaves and classify leaf morphology. The morphology of soybean leaves included lanceolate, oval, ellipse and round. First, an image collection platform was designed to collect images of soybean leaves. Then, the feature pyramid networks-single shot multibox detector (FPN-SSD) model was proposed to detect the top leaflets of soybean leaves on the collected images. Finally, a classification model based on knowledge distillation was proposed to classify different morphologies of soybean leaves. The obtained results indicated an overall classification accuracy of 0.956 over a private dataset of 3 200 soybean leaf images, and the accuracy of classification for each morphology was 1.00, 0.97, 0.93 and 0.94. The results showed that this method could effectively classify soybean leaf morphology and had great application potential in analyzing other phenotypic traits of soybean.

Key words: leaf morphology classification, feature pyramid networks-single shot multibox detector (FPN-SSD), knowledge distillation, top leaflet detection

Introduction

The tolerance of soybean varieties is mainly manifested by external characteristics; among them, soybean leaf morphology is an important and effective external manifestation of soybean germplasm gene differences (Haile et al., 1998). Unfortunately, most existing soybean leaf morphology classification still relies on naked-eye observation, which requires a large amount of agronomic knowledge and is very time-consuming. To overcome these problems, an automated method is needed to detect the top leaflets of soybean leaves and to classify leaves of different morphologies.

In recent years, with the rapid development of computer vision, machine learning and deep learning have been increasingly applied in the agricultural field. Many agricultural problems can be effectively solved by introducing computer image processing and machine learning technology (Zhang et al., 2019; Jia and Ji, 2013; Chen, 2019; Liu, 2018; Ma et al., 2017). However, classification and recognition using traditional machine learning requires features to be selected manually and then extracted with image processing technology, a process that is subjective, time-consuming, laborious and inefficient. Deep learning can effectively extract the deep features of images and achieves good results in image classification and recognition (Luo, 2019; Su et al., 2019; Li et al., 2019; Qiu et al., 2019; Wang, 2019; Zhang et al., 2019). Although the application of machine learning or deep learning to crop classification has been studied since the 1980s, there have been few studies worldwide on the detection and classification of soybean leaf morphology. Existing references offer only text descriptions and picture examples, with no specific quantitative standards. Therefore, this paper proposed the FPN-SSD model for top leaflet detection of soybean leaves and a leaf morphology classification model based on knowledge distillation.

First, a soybean leaf image acquisition device was set up to avoid nonlinear deformation of soybean leaves collected in the field. Then, to improve robustness, top leaflet detection of soybean leaves was performed on the acquired images. Finally, the classification model was trained by knowledge distillation to extract the deep features of soybean leaf morphology, and the automatic classification of soybean leaf morphology was realized. The experiments showed that the FPN-SSD model could avoid background interference and improve the robustness of leaf morphology classification. A leaf morphology classification model based on knowledge distillation could provide excellent classification performance while requiring few parameters and floating-point operations (FLOPs).

Materials and Methods

Data preparation

Since there was currently no public soybean leaf morphology dataset, a private dataset of soybean leaves was collected. A collection device for soybean leaf images was set up to avoid problems such as nonlinear distortion of leaf morphology, which arises when the camera is not exactly perpendicular to the leaf. The device was a closed black box with a multi-angle lighting system inside that eliminated shadows. A camera was fixed above the black box, and an infrared connection was used to control data collection. The collection device and the collected sample data are shown in Fig. 1.

Fig. 1 Inside photo of collection device (a) and sample data collected (b)

The data used in this paper were obtained from soybean plants in the soybean test field of Heilongjiang Province Academy of Agricultural Sciences. To avoid classification errors caused by potential differences in soybean leaf morphology among varieties, mature trefoil leaves were randomly picked during complete flowering from the upper and lower parts of plants of different soybean varieties.

Images of the 3 200 picked soybean leaves were acquired with the image acquisition device. In total, 3 200 soybean leaf images were obtained, of which 70% (2 240 images) were used as the training set and 30% (960 images) as the test set. In addition, to increase the size of the training set and reduce overfitting, horizontal flipping, random rotation, added noise and other operations were used to expand the training dataset. The resulting training set included 8 960 soybean leaf images. The distribution of the different leaf morphologies is shown in Table 1.
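The split and augmentation counts above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the factor of 4 samples per training image is inferred from 8 960 / 2 240 rather than stated in the text.

```python
# Counts reported in the text; the 4x factor is an inference, not a stated value.
TOTAL_IMAGES = 3200
TRAIN_FRACTION = 0.7


def split_counts(total, train_fraction):
    """Return (train, test) image counts for a simple proportional split."""
    train = round(total * train_fraction)
    return train, total - train


def augmented_size(train_count, samples_per_image):
    """Training-set size when each image yields `samples_per_image` samples
    (the original image plus its augmented variants)."""
    return train_count * samples_per_image


train_count, test_count = split_counts(TOTAL_IMAGES, TRAIN_FRACTION)
print(train_count, test_count)          # 2240 960
print(augmented_size(train_count, 4))   # 8960
```

Only the training set is expanded; the 960 test images are left unchanged, which is consistent with the counts given above.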

Table 1 Number of categories in dataset

Soybean leaf data characteristics

The morphology of soybean leaves was based on the morphology of the top leaflet in the three-leaf compound leaves of the plant, which was divided into four morphologies: lanceolate, oval, ellipse and round, as shown in Fig. 2. If the images were classified directly, the background and the other leaflets interfered with the leaf morphology and reduced the classification accuracy. Therefore, deep learning was used to detect the top leaflets in soybean leaves.

Fig. 2 Four morphologies of soybean leaves (from left to right: lanceolate, oval, ellipse and round)

The main characteristic of the data was that morphology differences between classes were small, while morphology differences within classes were large, as shown in Fig. 3. It was difficult to extract features using traditional methods. Therefore, this paper considered using deep learning methods to classify the morphology of the top leaflets after detection.

Experimental design

Based on the characteristics of soybean leaf data, to improve the robustness of the model, the FPN-SSD model was used to detect the top leaflet of soybean leaves. Then, the knowledge distillation method was used to train the classification model so that the classification model could learn the dark knowledge of the teacher network. Finally, the trained classification model was used to classify the detected soybean leaves so that the trained classification network model had stronger generalization ability and higher classification accuracy. The technical flowchart is shown in Fig. 4.

Fig. 3 Contrast of ellipse and round leaf (left two) and contrast of two round leaves (right two)

Fig. 4 Technical flowchart

FPN-SSD model

In this paper, the FPN-SSD model (Lin et al., 2017; Liu et al., 2016) was used to detect the top leaflet in the three-leaf compound leaves of soybean. The use of FPN enabled the fusion of high-level semantic information with low-level spatial information, enhancing the ability of the model to detect smaller leaves.
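The fusion of high-level and low-level information in a feature pyramid can be illustrated with a minimal sketch of one top-down merge step. It assumes single-channel feature maps stored as nested lists and omits the 1x1 lateral convolutions and 3x3 smoothing convolutions of a full FPN; the function names and example values are hypothetical and do not come from the paper.

```python
def upsample_nearest_2x(feature_map):
    """Double the height and width of a 2-D map by repeating each value."""
    upsampled = []
    for row in feature_map:
        wide_row = [value for value in row for _ in range(2)]
        upsampled.append(wide_row)
        upsampled.append(list(wide_row))
    return upsampled


def top_down_merge(coarse_map, fine_map):
    """Add the upsampled coarse (high-level) map to the fine (low-level) map."""
    upsampled = upsample_nearest_2x(coarse_map)
    return [
        [up + fine for up, fine in zip(up_row, fine_row)]
        for up_row, fine_row in zip(upsampled, fine_map)
    ]


coarse = [[1.0, 2.0],
          [3.0, 4.0]]                   # 2x2 high-level (semantic) map
fine = [[0.1] * 4 for _ in range(4)]    # 4x4 low-level (spatial) map
merged = top_down_merge(coarse, fine)   # 4x4 map combining both levels
for row in merged:
    print(row)
```

The merged map keeps the resolution of the low-level map while carrying values from the high-level map, which is the property that helps with small objects.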

ResNet-50 (He et al., 2016) was used as the backbone network of FPN-SSD so that the backbone network had a larger receptive field and better feature extraction ability. The focal loss function was used so that all samples contributed to training while hard-to-learn samples received more weight. The FPN-SSD model structure is shown in Fig. 5.
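How focal loss shifts attention toward difficult samples can be seen in a small numerical sketch. It uses the usual form FL(p) = -alpha (1 - p)^gamma log(p), where p is the predicted probability of the true class. The values gamma = 2 and alpha = 0.25 are commonly used defaults and are assumptions here; the settings used in this paper are not stated.

```python
import math


def cross_entropy(p_true):
    """Standard cross-entropy for the predicted probability of the true class."""
    p_true = min(max(p_true, 1e-12), 1.0)
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0, alpha=0.25):
    """Focal loss: cross-entropy scaled by alpha * (1 - p_true) ** gamma.

    gamma and alpha are assumed defaults, not values reported in the paper.
    """
    p_true = min(max(p_true, 1e-12), 1.0)
    return alpha * (1.0 - p_true) ** gamma * cross_entropy(p_true)


# Easy sample (p = 0.9) versus hard sample (p = 0.1)
for p in (0.9, 0.5, 0.1):
    print(f"p={p:.1f}  CE={cross_entropy(p):.4f}  FL={focal_loss(p):.4f}")
```

The modulating factor (1 - p)^gamma is close to zero for well-classified samples, so their contribution shrinks far more than that of poorly classified ones.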

Fig. 5 FPN-SSD model structure

Knowledge distillation model

To further extract the deep features of soybean leaf morphology, the deep learning method was used to classify the detected soybean leaf images. An excessively deep model gave a better classification effect, but its demands on computer performance were too high, and because its parameters and FLOPs were large, its classification speed was too slow.

The knowledge distillation method was used to train the classification model so that the student network could learn the "dark knowledge" of the teacher network, which reduced the parameters and FLOPs of the classification model and improved its classification speed. ResNet-101 was used as the teacher network to train the student network. The training method divided the output of the trained teacher network by the temperature coefficient "T" and then applied the softmax activation function to obtain the "soft target" probabilities, which were used as labels to guide student network training. The softmax calculation that produced the "soft target" from the teacher network output is shown in formula 1, and the loss function for training the student network is shown in formula 2:

$$q_i = \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)} \qquad (1)$$

Where, T was the temperature coefficient; z was the network forward propagation output; and q_i was the probability of the softmax output through the "soft target".
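The effect of the temperature coefficient on the teacher output can be sketched in a few lines. The logits below are hypothetical values for four classes and T = 4 is an arbitrary illustrative choice; neither comes from the paper.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [z / temperature for z in logits]
    max_scaled = max(scaled)  # subtract the maximum to avoid overflow in exp
    exps = [math.exp(s - max_scaled) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


teacher_logits = [6.0, 2.0, 1.0, 0.5]  # hypothetical teacher outputs
hard = softmax_with_temperature(teacher_logits, temperature=1.0)
soft = softmax_with_temperature(teacher_logits, temperature=4.0)
print([round(q, 3) for q in hard])
print([round(q, 3) for q in soft])
```

A higher temperature flattens the distribution, so the non-maximal classes receive more probability mass and the similarity among categories becomes visible to the student network.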

Where, λ was the hyperparameter in the loss function. Finally, the trained student network was used as the classification model for leaf morphology. The purpose was to let the student network learn the "dark knowledge" of the teacher network. With standard deep learning training, a network could learn only the morphology category of each leaf. With knowledge distillation, the student network could also learn the trained teacher network's judgment of different leaf morphologies and the similarity among categories. This allowed the smaller student network to perform nearly as well as the deeper teacher network while having fewer parameters and FLOPs, and thus a faster classification speed. The structure of the knowledge distillation model is shown in Fig. 6.
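As an illustration of how a hyperparameter λ can balance soft and hard targets, the sketch below uses one widely used formulation of the distillation loss: λ T^2 times the cross-entropy against the teacher's soft targets, plus (1 - λ) times the ordinary cross-entropy against the true label. This is an assumption made for illustration. The exact form of formula 2 is not reproduced in this text, and the logits, T = 4 and λ = 0.5 are hypothetical values.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by the temperature."""
    scaled = [z / temperature for z in logits]
    max_scaled = max(scaled)
    exps = [math.exp(s - max_scaled) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, lam=0.5):
    """Assumed form: lam * T^2 * soft-target CE + (1 - lam) * hard-label CE."""
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # Cross-entropy between the teacher's soft targets and the student's output
    soft_loss = -sum(q * math.log(max(p, 1e-12))
                     for q, p in zip(soft_teacher, soft_student))
    # Ordinary cross-entropy against the ground-truth class index
    hard_loss = -math.log(max(softmax(student_logits)[label], 1e-12))
    return lam * temperature ** 2 * soft_loss + (1.0 - lam) * hard_loss


teacher = [6.0, 2.0, 1.0, 0.5]   # hypothetical teacher logits
student = [3.0, 2.5, 0.5, 0.2]   # hypothetical student logits
print(distillation_loss(student, teacher, label=0))
```

The T^2 factor is a common convention that keeps the gradient magnitude of the soft term comparable across temperatures; whether the paper uses it is not stated.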

Fig. 6 Knowledge distillation model structure

Results

Experimental environment

In this paper, the experimental platform was a server equipped with an Intel Core i5-8400 processor, 8 GB of memory and an NVIDIA GTX 1070 Ti GPU, and the operating system was Ubuntu 16.04. Soybean leaf images were acquired using a Canon EOS 7D SLR camera. The PyTorch framework and the PyCharm development environment were used to train the detection and classification models and to classify soybean leaf morphology.

Model classification accuracy

The FPN-SSD model was first used to detect the top leaflet of soybean leaves. Then, the classification model based on knowledge distillation was used to classify soybean leaf morphology. The experimental results showed that the proposed method had not only an excellent classification accuracy, but also a small number of parameters and FLOPs.

The classification accuracy of each soybean leaf morphology was calculated, as shown in formula 3:

$$\mathrm{Accuracy} = \frac{1}{M} \sum_{i=1}^{M} p_i \qquad (3)$$

Where, p_i was 1 if the morphology type in image i was accurately classified and 0 otherwise, and M represented the number of test data.
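The accuracy defined above, the fraction of test images whose morphology is classified correctly, can be computed overall and per morphology as in the following sketch. The labels and predictions are made-up examples, not data from the paper.

```python
def accuracy(predicted, actual):
    """Fraction of samples whose predicted class equals the true class."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def per_class_accuracy(predicted, actual):
    """Accuracy computed separately over the samples of each true class."""
    totals, hits = {}, {}
    for p, a in zip(predicted, actual):
        totals[a] = totals.get(a, 0) + 1
        hits[a] = hits.get(a, 0) + (1 if p == a else 0)
    return {cls: hits[cls] / totals[cls] for cls in totals}


# Made-up labels for illustration only
actual = ["lanceolate", "oval", "oval", "ellipse", "round", "round"]
predicted = ["lanceolate", "oval", "ellipse", "ellipse", "round", "ellipse"]
print(accuracy(predicted, actual))
print(per_class_accuracy(predicted, actual))
```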

The accuracy of the FPN-SSD model and knowledge distillation for the classification of soybean leaf morphology is shown in Table 2.

Table 2 Accuracy of FPN-SSD model and knowledge distillation for classification of soybean leaf morphology

Classification accuracy analysis with and without FPN-SSD model

To study the effect of the FPN-SSD model on classification results, the proposed method was compared with direct classification and with a traditional image processing method. When the FPN-SSD model was not used, deep learning was used to directly classify the acquired soybean leaf images. When the traditional image processing method was used, marker-based watershed segmentation was used to extract soybean leaves and then deep learning was used to classify the extracted soybean leaves. The results showed that the automatic classification method consisting of the FPN-SSD model and the classification model based on knowledge distillation had a stronger soybean leaf classification ability than the other automatic classification models. The accuracy of the different automatic detection methods for the classification of soybean leaves is shown in Table 3.

Classification accuracy and FLOPs analysis with and without knowledge distillation

To explore the effects of knowledge distillation on the accuracy and FLOPs of soybean leaf morphology classification, classification without knowledge distillation was also evaluated. The student network and the teacher network were each used to directly classify the soybean leaves detected by the FPN-SSD model. The results showed that the classification model based on knowledge distillation achieved a smaller number of parameters and FLOPs than the other automatic classification methods while ensuring good classification accuracy. The accuracy and FLOPs of the different automatic classification methods for the classification of soybean leaves are shown in Table 4.
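To make the two cost measures concrete, the sketch below counts the parameters and FLOPs of a single convolutional layer. It assumes that one multiply-accumulate is counted as one FLOP, which is only one of several conventions (some tools count two); how FLOPs were counted in this paper is not stated, and the layer dimensions are hypothetical.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Number of learnable parameters in a square-kernel 2-D convolution."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def conv2d_flops(in_channels, out_channels, kernel_size, out_height, out_width):
    """Multiply-accumulate count for one forward pass (bias ignored)."""
    macs_per_output = in_channels * kernel_size * kernel_size
    return macs_per_output * out_channels * out_height * out_width


# Hypothetical layer: 3x3 convolution, 64 -> 128 channels, 56x56 output map
print(conv2d_params(64, 128, 3))          # 73856
print(conv2d_flops(64, 128, 3, 56, 56))   # 231211008
```

Summing these counts over all layers gives the totals for a network, which is why a network with fewer and narrower layers has lower parameter and FLOP counts.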

Comparative analysis with other algorithm classification results

To further explore the classification performance of the automatic classification model of soybean leaf morphology based on knowledge distillation, three commonly used convolutional neural network (CNN) classification models were trained and compared with the knowledge distillation-based method. Next, a support vector machine (SVM) was used to classify the leaf shape and was compared with the classification model based on knowledge distillation. As a traditional machine learning algorithm, SVM needed image processing technology to extract the characteristics of soybean leaves. Therefore, traditional image processing technology was used to extract a total of 24 features from the detected soybean leaves (e.g., morphology and texture features of soybean leaf images) as the input of SVM. The results showed that the classification accuracy of the SVM for soybean leaf morphology was lower than that of the automatic classification model based on knowledge distillation proposed in this paper. Different CNN models (AlexNet, VGG and DenseNet) were explored with a variety of hyperparameters, and transfer learning was used to train them. Although the classification results of the AlexNet, VGG and DenseNet models were better than those of SVM, they were still lower than those of the soybean leaf classification model based on knowledge distillation. The accuracy of different methods for the classification of soybean leaves is shown in Table 5.

Table 3 Accuracy of different automatic detection methods for classification of soybean leaves

Table 4 Accuracy and FLOPs of different automatic classification methods for classification of soybean leaves

Table 5 Classification accuracy of soybean leaf morphology by different methods

Discussion

To analyze the effect of the FPN-SSD model on the classification effect of soybean leaf morphology and to compare the classification performance and FLOPs of the classification model based on knowledge distillation, this study first compared the classification accuracy of the same classification model with or without the FPN-SSD model. Then, the classification accuracy and FLOPs of the classification model were compared with and without knowledge distillation training. Finally, the accuracy of the classification of soybean leaf morphology was compared with the classification accuracies of other methods.

This study showed that the FPN-SSD model could accurately detect the top leaflet in the three-leaf compound leaves of soybean. With the proposed FPN-SSD model, the performance of soybean leaf morphology classification was clearly improved. A traditional image processing method, i.e., marker-based watershed segmentation (Meyer and Beucher, 1990), could be used to extract individual soybean leaves, but its detection result was greatly affected by the placement of the leaves and it could not intelligently identify the top leaflet in the three-leaf compound leaves, which degraded the classification.

In addition, if the student network was used directly without knowledge distillation, the number of FLOPs was small, but its ability to extract deep features from soybean leaf images was weak and the network therefore achieved a low classification accuracy. If the teacher network was used directly without knowledge distillation, the classification accuracy was high, but because of the large number of network layers, the FLOPs and parameters were large and the classification speed was low, so it could not be applied in practice.

Based on the experimental results, the proposed model can be used in future work to analyze and study the phenotypic traits of other crops. Furthermore, to achieve real-time phenotypic trait analysis of field crops, the proposed model can be integrated with Internet of Things technology, which will further facilitate the application of germplasm genes in modern agriculture.

Conclusions

To accurately and quickly use computer vision to classify the morphology of soybean leaves, the characteristics of soybean leaf morphology and images were analyzed. In this paper, FPN-SSD was used to detect soybean leaves and a leaf morphology classification model based on knowledge distillation was proposed. Unlike other automatic image classification methods, the proposed method first detected the top leaflet in the three-leaf compound leaves of soybean to effectively avoid the influence of noise. Then, the classification model based on knowledge distillation was used to classify soybean leaf morphology, which not only had an excellent classification accuracy, but also a small number of FLOPs. Compared with the three CNN image classification algorithms AlexNet, VGG and DenseNet and one machine learning algorithm SVM, the method proposed in this paper effectively reduced the FLOPs and better extracted the deep features of soybean leaves. In terms of the accuracy and FLOPs of soybean leaf morphology recognition, the method combined the ability of a large network to extract deep features with the relatively small number of FLOPs of a small network, which improved the speed and accuracy of soybean leaf shape recognition. The final classification accuracy was 0.956, which was higher than that of the other classification methods, and the small number of parameters and FLOPs could meet practical needs. This method would help agronomists better study soybean germplasm genes and further study the function of soybean leaves.