
A Review on the Application of Deep Learning Methods in Detection and Identification of Rice Diseases and Pests

2024-03-12 Xiaozhong Yu and Jinhua Zheng

Computers, Materials & Continua, 2024, Issue 1

Xiaozhong Yu and Jinhua Zheng

1 College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421002, China

2 Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang Normal University, Hengyang, 421002, China

ABSTRACT In rice production, the prevention and management of pests and diseases have always received special attention. Traditional methods require human experts, which is costly and time-consuming. Owing to the structural complexity of rice diseases and pests, recognizing and locating them quickly and reliably is difficult. Recently, deep learning (DL) technology has been employed to detect and identify rice diseases and pests. This paper introduces common publicly available datasets; summarizes the applications to rice diseases and pests from the aspects of image recognition, object detection, image segmentation, attention mechanisms, and few-shot learning methods according to differences in network structure; and compares the performance of existing studies. Finally, the current issues and challenges are explored from the perspectives of data acquisition, data processing, and application, and possible solutions and suggestions are provided. This study aims to review various DL models and provide improved insight into DL techniques and their cutting-edge progress in the prevention and management of rice diseases and pests.

KEYWORDS Deep learning; rice diseases and pests; image recognition; object detection

1 Introduction

As one of the most widely cultivated staple foods, rice is prone to pests and diseases that are difficult to detect at an early stage. Traditionally, manual and machine recognition methods have been employed for rice diseases and pests. Manual recognition relies on a high level of expertise and rich experience: people observe the crop and classify what they see with their own eyes. However, when pests and diseases break out on a large scale, they cannot be dealt with promptly, resulting in severe economic losses. Machine recognition, in contrast, usually adopts traditional image recognition methods, such as support vector machines (SVM) [1], artificial neural networks [2], genetic algorithms [3], the K-means clustering algorithm [4], and k-nearest neighbors [5]. These methods have limitations; for example, matching similar features under different lighting conditions is challenging. Current research and industry demand higher efficiency, higher accuracy, and broader application scenarios. Additionally, samples of rice pests are commonly obtained by expelling pests or trapping them with insecticidal lamps in the field, after which statistical methods are used to estimate pest numbers; however, the accuracy and real-time performance of these methods are lacking, which hinders rapid pest control. Moreover, in real-world environments, the detection and recognition of pests and diseases are influenced by factors such as lighting conditions, disease stages, regional distribution, and feature colors. Thus, the traditional methods do not perform well. For rice diseases, the onset symptoms are apparent and exhibit regional characteristics, so studies focus on identifying individual crop diseases and on regional disease detection and warning. For pests, field sampling and collection for detection and recognition are difficult because field pests migrate and conceal themselves.

Recently, deep learning (DL) technology has developed rapidly for image recognition and detection tasks, such as face recognition [6,7], object detection [8,9], image segmentation [10,11], and image translation [12,13]. Furthermore, numerous DL methods have been employed to detect and recognize rice diseases and pests. In practical production, researchers have explored appropriate model design approaches [14,15], combining hardware and software methods, to improve accuracy and reduce time costs. The utilization of DL for detecting rice diseases and pests not only holds considerable academic research significance but also offers a wide range of potential market applications. Surveys show that relevant studies have mainly focused on crop diseases and pests in general; within that work, the prevention and treatment of diseases and pests during rice production has received particular attention because rice is one of the world's highest-producing crops. DL technology has exhibited excellent performance in research on rice pests and diseases, but reviews of the work in this field are scarce, and existing reviews do not reflect the latest relevant research [16–18]. Thus, this study aims to comprehensively review the applications of DL to rice diseases and pests to provide guidance to scholars.

The rest of this paper is organized as follows. Section 2 introduces publicly available datasets and data preprocessing methods. Section 3 reviews the DL-based detection and recognition methods for rice diseases and pests; Section 4 compares and analyzes the detection and recognition performance of existing DL models; and Section 5 explores the challenges and future ideas. Finally, the research on rice diseases and pests is summarized.

2 Rice Plant Disease and Pest Datasets

Three data sources exist for agriculture: self-collected, network-collected, and public datasets [19]. The images are generally captured by drone-mounted aerial cameras, mobile phones, digital cameras, etc. Alternatively, sensors and photosensitive devices obtain spectral and infrared image data. Thus far, compared with general computer vision datasets such as ImageNet, COCO, and PASCAL VOC 2007/2012, few public datasets of rice diseases and pests exist. Table 1 lists some pest and disease datasets in the agricultural field, and Table 2 presents the crop types corresponding to the datasets.

Table 2: The comprehensive crop types and features of datasets

Crop pest and disease datasets collected in natural environments are highly practical. Take the IP102 dataset as an example. It contains 14 kinds of rice pests in a total of 8417 samples; the number distribution is shown in Table 3, and sample images of each pest are shown in Fig. 1. The table and figure show that the most common pests are the Rice Leaf Roller and Asiatic Rice Borer, whereas the rarest are the Paddy Stem Maggot and Grain Spreader Thrips, which reflects how often these pests are encountered in actual rice production. A DL model therefore has to remain accurate when trained on such a non-uniformly distributed sample set. Publicly available datasets on rice diseases and pests contain few samples, and their image resolution needs to be improved. However, DL methods are data-driven, and insufficient data will lead to overfitting during network training. Obtaining more high-resolution image samples and generating high-quality training samples from the limited raw data has therefore become an urgent problem.
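One common way to keep rare classes from being ignored during training is to weight the loss by inverse class frequency. The following is a minimal sketch of that idea, not a method taken from the studies reviewed here; the function name and the example counts are hypothetical and are not the Table 3 figures.

```python
def inverse_frequency_weights(counts):
    """Return per-class loss weights proportional to 1 / class frequency.

    counts: dict mapping class name -> number of training samples.
    The weights are scaled so that sum(count * weight) equals the total
    number of samples, i.e. the average per-sample weight stays 1.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * c) for name, c in counts.items()}


# Hypothetical counts for a common and a rare pest class (illustration only).
example_counts = {"common_pest": 900, "rare_pest": 100}
weights = inverse_frequency_weights(example_counts)
```

A rare class receives a proportionally larger weight, so each of its samples contributes more to the loss than a sample from a common class.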

Table 3: The sample distribution of the IP102 Dataset

Figure 1: Illustration of samples in IP102

In addition to capturing more samples with specialized equipment, existing samples can be expanded, which is known as data augmentation. When the number of raw samples in a DL task is small, new images can be generated by techniques such as flipping, rotation, scaling, cropping, shifting, adding Gaussian noise, and changing contrast. Recently, methods such as generative adversarial networks [35,36], sample matching [37], counterfactual reasoning [38,39], and automatic augmentation [40] have emerged that can effectively alleviate the problem of insufficient training samples, and thereby the pest and disease datasets can be expanded. As an example, ESRGAN has been used to enhance image data [41]; its framework is shown in Fig. 2. Fig. 3 provides some examples of the data augmentation of rice diseases and pests.
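The basic geometric and photometric augmentations listed above can be sketched in a few lines. This is an illustrative, dependency-free version that assumes a grayscale image stored as a list of rows of integers in 0..255; the function names and parameter values are assumptions made for this sketch, and real pipelines would normally use an image library.

```python
import random


def clip(value):
    """Round to an integer and clamp to the valid 0..255 pixel range."""
    return max(0, min(255, int(round(value))))


def hflip(img):
    """Horizontal flip: mirror each row left to right."""
    return [row[::-1] for row in img]


def rot90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return [[clip(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def adjust_contrast(img, factor):
    """Scale each pixel's deviation from the mean intensity by factor."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return [[clip(mean + factor * (p - mean)) for p in row] for row in img]


def augment(img, seed=0):
    """Return four augmented variants of one image."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return [
        hflip(img),
        rot90(img),
        add_gaussian_noise(img, 5.0, rng),
        adjust_contrast(img, 1.5),
    ]
```

Each call to `augment` turns one raw sample into four additional training samples; for detection or segmentation data, the labels (boxes or masks) would have to be transformed in the same way as the image.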

3 The Application of Deep Learning in Detection and Recognition of Rice Diseases and Pests

This section mainly reviews DL-based detection techniques for rice diseases and pests. As in general computer vision tasks, DL methods detect rice diseases and pests by extracting features with different model designs. Based on the network structure, the research can be divided into four categories: image classification networks, object detection networks, image segmentation networks, and networks integrating attention mechanisms. Additionally, an overview of meta-learning methods in the context of detecting and recognizing rice diseases and pests is provided. Fig. 4 shows the framework of the DL-based method for detecting and identifying rice diseases and pests.

Figure 2:The framework of ESRGAN

Figure 3:Data augmentation of rice pest images

Figure 4: The overall framework for detecting and identifying rice diseases and pests based on deep learning

3.1 Image Classification Network

Image classification assigns a category label to an image. It includes image preprocessing, feature extraction, and classifier design. DL-based image classification methods learn features autonomously from training samples through neural networks, extracting high-dimensional, abstract features closely tied to the classifier, which makes them end-to-end methods. With the development of convolutional neural networks (CNNs) such as AlexNet [42], VGGNet [43], GoogleNet [44], ResNet [45], DenseNet [46], MobileNet [47], ShuffleNet [48], and EfficientNet [49], CNNs have become the most commonly employed method for feature extraction from rice disease images. Most existing studies employ these classic networks as the classification network for images of rice diseases and pests, and some design network structures tailored to practical problems. The basic structure of the classification recognition model is shown in Fig. 5. When an image to be recognized is input into the network, the trained model returns the classification label corresponding to the image.

Figure 5:The structure of recognition model based on CNN

For the design of rice pest recognition networks, Yang et al. [50] utilized transfer learning. They employed a pre-trained VGG16 model and identified six rice pests, including rice leaf borer, rice planthopper, dichemical borer, trichemical borer, rice locust, and rice weevil, with an accuracy of 99.05%. Zeng et al. [51] used ESRGAN to enhance the rice image data, addressing the issues of low resolution and low information content. Furthermore, they proposed a model called SCResNet based on ResNet and deployed it on mobile devices. Their accuracy on a seven-class rice pest identification task reached 91.2%. Bao et al. designed a lightweight residual network-based method for identifying rice pests in natural scenes, named LW-ResNet [52]. The model effectively extracted the deep global features of rice pest images by adding convolutional layers and branches to improve the residual block. They also designed a lightweight attention submodule to focus on local discriminative features of pests. LW-ResNet achieved a recognition accuracy of 92.5% on a test dataset of 13 rice pests.

In particular, to support the deployment of DL models on mobile devices and in resource-constrained environments, various lightweight network architectures have been proposed. They rely on techniques such as depthwise separable convolution and group convolution, and representative models include MobileNet, ShuffleNet, and EfficientNet. Figs. 6 and 7 display the basic structures of ShuffleNet and EfficientNet, respectively. ShuffleNet achieves high efficiency by employing channel shuffling, group convolution, bottleneck building blocks, and multiple shuffle units. Different versions are available to strike a balance between performance and model complexity [53]. EfficientNet improves performance through compound scaling, efficient building blocks, and model variants for different needs. It has become a popular choice for transfer learning in computer vision tasks. In rice pest identification and detection tasks, Chen proposed a novel network architecture named Mobile-Atten [54], which employs MobileNet-V2 as the backbone and adds an attention mechanism for learning the importance of inter-channel relationships. Mobile-Atten was tested on a rice disease dataset captured from real-life agricultural fields and online sources, and it achieved 98.48% accuracy in rice plant disease identification against complicated backgrounds. To improve rice disease classification accuracy, Zhou proposed a novel DL model called GE-ShuffleNet [55], which is easier to deploy, with fewer parameters and a smaller model size than other models such as ShuffleNet V2 [53], AlexNet, and VGGNet. Experiments on four rice leaf diseases showed that the identification accuracy of GE-ShuffleNet reached 96.65%. Furthermore, Nguyen et al. provided a new dataset of rice leaf diseases [56], which contains 13106 rice leaf images covering 7 common diseases. They used the RMSprop and Adam optimization algorithms with the EfficientNet-B4 model to evaluate the dataset, achieving a highest classification accuracy of 89%.
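To make the efficiency argument concrete, the sketch below shows the channel shuffle operation used by ShuffleNet on a plain list of channel indices, together with weight counts for a standard, a grouped, and a depthwise separable convolution. It is an illustration under simplifying assumptions (bias terms ignored, helper names chosen for this sketch), not code from any of the cited models.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (reshape -> transpose -> flatten).

    channels: list of channel items ordered group by group.
    After shuffling, each group of the next grouped convolution receives
    channels that originated from every group of the previous one.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


def standard_conv_params(c_in, c_out, k):
    """Weights of a k x k convolution from c_in to c_out channels."""
    return k * k * c_in * c_out


def group_conv_params(c_in, c_out, k, groups):
    """Weights of a grouped k x k convolution (each group sees c_in/groups)."""
    return k * k * (c_in // groups) * (c_out // groups) * groups


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)  # [0, 3, 1, 4, 2, 5]
standard = standard_conv_params(32, 64, 3)                # 18432
grouped = group_conv_params(32, 64, 3, groups=4)          # 4608
separable = depthwise_separable_params(32, 64, 3)         # 2336
```

For this 3 x 3 layer with 32 input and 64 output channels, the depthwise separable variant needs roughly one eighth of the weights of the standard convolution, which is the kind of saving that makes these designs attractive for mobile deployment.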

Figure 6: The basic unit structure of ShuffleNet

Figure 7:The basic unit structure of EfficientNet

In summary, methods based on image classification networks are widely used. They identify rice diseases and pests by categorizing entire images into predefined classes, such as "healthy rice" or "rice with diseases/pests". These networks are trained on labeled datasets, where each image is associated with a specific class label. During inference, the models analyze a given image and assign it to the most likely class based on learned patterns and features. The strengths of image classification networks include simplicity, efficiency, and ease of training, which makes them suitable for tasks where a binary or multi-class decision is sufficient. However, their primary limitation is their inability to provide detailed spatial information. They cannot pinpoint the exact location or extent of diseases or pests within an image, limiting their usefulness in precision agriculture applications that require precise localization for targeted interventions. Additionally, they struggle with images containing multiple issues or overlapping instances of diseases and pests, as they assign a single label to an entire image. Table 4 summarizes some commonly used models for detecting rice leaf diseases and pests.
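The final step of such a classifier, turning the network's raw output scores into a single label, can be sketched as follows. The class names and score values are hypothetical placeholders; a real model would produce the scores from an input image.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits, class_names):
    """Return the most likely class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical class list and network outputs for one image.
classes = ["healthy", "rice blast", "brown spot"]
label, confidence = predict([0.2, 2.5, -1.0], classes)
```

Because only one label is returned per image, an image that shows two diseases at once is still forced into a single class, which is the limitation noted above.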

Table 4: Application of image classification network in rice diseases and pests

3.2 Object Detection Network

Object detection methods based on DL determine both the location and the category of a target. Recently, many seminal object detection algorithms have been proposed. One type is the R-CNN family, represented by Faster R-CNN [63] and Cascade R-CNN [64], which first extracts candidate bounding-box regions and then regresses and classifies them. These methods require two steps to obtain the final detection results and are hence called two-stage algorithms. Another type comprises one-stage algorithms such as YOLO [65,66], the single shot multibox detector (SSD) [67], and RetinaNet [68], which directly regress the bounding box of the target. Although such methods are generally less accurate than two-stage algorithms, they are faster. A third type, anchor-free algorithms such as CenterNet [69] and CornerNet [70], recasts anchor-based box regression as a key-point detection problem. Such algorithms also open new directions and reflect the trend of cross-fertilization between object detection and other machine learning fields. DL-based detection algorithms provide excellent technical support for research on pest and disease detection in rice production. Table 5 lists the applications of object detection networks to rice diseases and pests.

Table 5: Application of object detection network in rice diseases and pests

For one-stage detection of rice diseases and pests, She et al. [71] used feature pyramids to improve the multi-scale feature maps of SSD, which gave better convergence on small targets and improved both the recognition rate and the detection speed for five types of pests: Chilo suppressalis, stem borer, rice planthopper, rice locust, and diamond. Yao et al. [72] improved RetinaNet by normalizing and optimizing the FPN structure to identify the damage status of rice leaf roller and stem borer, with an average detection accuracy of 93.76%, which is better than the results obtained with VGG and ResNet101 as feature extraction networks. One study [73] proposed an automated smartphone system to detect rice leaf diseases. The system is based on the YOLOv4 model and is mainly applicable to brown spot, leaf blast, and hispa, with a recognition efficiency of 97.36%. The system can push suitable management plans to farmers. References [74,75] employed YOLOv5 for rice leaf disease recognition and constructed a detection system, yielding good results in terms of accuracy, recall rate, and mAP. Fig. 8 displays the one-stage model.

Figure 8: The structure of the one-stage model

For two-stage detection of rice diseases and pests, Rahman et al. [76] proposed a two-stage small-scale CNN model for rice disease detection that focuses on model size. They tested the model on 1426 images of rice diseases and pests collected from the paddy fields of the Bangladesh Rice Research Institute and achieved an accuracy of 93.3%. A two-stage method called RiceNet [77] was proposed to identify four important rice diseases: rice panicle neck blast, rice false smut, rice leaf blast, and rice stem blast. In the first phase of RiceNet, the YOLOX algorithm detects the diseased regions, which are used to reconstruct the dataset. In the second phase, a Siamese network prevents overfitting and improves the identification accuracy by directly identifying the limited annotated rice disease patches. Reference [78] optimized Cascade R-CNN using a feature pyramid network (FPN), soft non-maximum suppression, and ROI Align calibration, effectively improving the detection of small targets and the recognition of overlapping targets. Five types of rice pests, namely rice planthopper, rice grasshopper, black-tailed leafhopper, mole cricket, and rice armyworm, were tested on the optimized Cascade R-CNN, which achieved an accuracy of 94.15%. Bari et al. [79] proposed a region-based CNN (Faster R-CNN) for the real-time detection of rice leaf diseases to improve the diagnostic accuracy. Combined with the RPN architecture, their model accurately located leaf diseases, generating candidate regions for diagnosing rice blast, brown spot, and hispa with accuracy rates of 98.09%, 98.85%, and 99.17%, respectively. Additionally, the model identified healthy rice leaves with an accuracy of 99.25%. The two-stage model is shown in Fig. 9.
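Soft non-maximum suppression, mentioned above as one of the optimizations for overlapping targets, lowers the scores of boxes that overlap an already selected box instead of discarding them outright. The following is a minimal sketch of the Gaussian variant, assuming boxes in (x1, y1, x2, y2) form; the parameter defaults and function names are illustrative choices and are not taken from reference [78].

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: decay overlapping scores by exp(-iou^2 / sigma).

    Returns (box, score) pairs in the order they were selected.
    """
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Select the highest-scoring box among those still in play.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best_idx)
        kept.append((best_box, best_score))
        decayed = []
        for box, score in remaining:
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_threshold:  # drop boxes whose score collapsed
                decayed.append((box, score))
        remaining = decayed
    return kept
```

With hard NMS, a second insect partly hidden behind the first would be removed once its box overlaps strongly enough; here it survives with a reduced score, which is why the technique helps with overlapping pests.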

Figure 9: The structure of the two-stage model

For anchor-free detection of rice diseases and pests, Yao et al. [80] proposed an automatic detection algorithm for planthoppers in light-trap images based on CornerNet. Because rice planthoppers occupy only a small proportion of a light-trap insect image, they used an overlapping sliding window method to increase the relative size of the planthoppers in each detection window and removed redundant detection boxes through a detection box suppression method. This effectively improved the detection of rice planthoppers in light-trap insect images. To prevent the same insect in the same posture from being counted repeatedly, Lin et al. [81] proposed a detection and recognition method that combines image redundancy elimination with a CenterNet network. Through truncation thresholding, bilateral filtering, and redundancy elimination operations, they solved the problem of duplicate detection in similar images, supporting early warning of rice planthoppers and prediction of population density. Fig. 10 depicts the anchor-free model.
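The overlapping sliding window idea can be illustrated with a short tiling routine: a large image is cut into overlapping tiles so that small insects occupy a larger share of each tile, and detections are then mapped back to full-image coordinates. This is a generic sketch; the window size, overlap, and function names are assumptions and do not reproduce the settings of reference [80].

```python
def tile_starts(size, window, stride):
    """Start offsets along one axis so that tiles cover the full extent."""
    if size <= window:
        return [0]
    starts = list(range(0, size - window, stride))
    starts.append(size - window)  # last tile sits flush with the border
    return starts


def sliding_windows(width, height, window, overlap):
    """Return overlapping square tiles as (x1, y1, x2, y2) boxes."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    stride = window - overlap
    return [(x, y, min(x + window, width), min(y + window, height))
            for y in tile_starts(height, window, stride)
            for x in tile_starts(width, window, stride)]


def to_global(box, tile):
    """Shift a box detected inside a tile back to full-image coordinates."""
    dx, dy = tile[0], tile[1]
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)


tiles = sliding_windows(width=1000, height=400, window=400, overlap=100)
```

An insect lying on a tile border appears in two neighboring tiles, so the mapped-back boxes contain duplicates; this is the reason a suppression step for redundant boxes follows the tiling.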

Figure 10: The structure of the anchor-free model

To sum up, detection networks identify and locate instances of rice diseases and pests by drawing bounding boxes around affected areas in images. They learn from annotated datasets where these bounding boxes are specified. In contrast to classification networks, which determine whether a particular issue is present, detection networks provide spatial information about the problems' positions within the images. However, they have limitations compared to classification networks. They excel at localization but lack details about the extent and severity of the issues. They also struggle with overlapping cases and crowded images, potentially missing some instances. Furthermore, like segmentation networks, they require annotated data with bounding box information for training, which can be resource intensive and costly.

3.3 Image Segmentation Network

Image segmentation divides an image into multiple regions by identifying areas with similar or identical features, allowing key information to be extracted and regions of no interest to be removed. The image segmentation algorithms commonly used on images of rice diseases and pests include threshold segmentation [82,83], edge detection [84], clustering segmentation [24,85], and deformable shape segmentation [86]. These methods suffer from heavy computation and poor robustness, and they cannot effectively extract image features. With the rise of deep neural networks, semantic segmentation network models have been proposed, such as FCN [87], U-Net [88], SegNet [89], PSPNet [90], Mask R-CNN [91], and Segformer [92]. These models are now widely applied to rice disease and pest detection and recognition, and they effectively address the issues of noise and nonuniformity in images. Table 6 lists the relevant applications of image segmentation methods to rice diseases and pests, and Fig. 11 presents the basic semantic segmentation unit.

Table 6: Application of image segmentation method in rice diseases and pests

Figure 11:Semantic segmentation network model

Feng et al. [93] proposed DFFANet, a real-time segmentation method for assessing the severity of rice blast disease based on feature fusion and an attention mechanism. It includes a feature extraction module, a feature fusion module, and a lightweight attention module. It effectively extracts shallow and deep features of rice blast and fuses the features extracted at different scales. The model achieved 96.15% accuracy in rice blast spot segmentation. Gong et al. [94] introduced a new encoder-decoder and a series of sub-networks connected by skip paths into the FCN network. The resulting model, FCA-ECAD, combines long skip connections and shortcut connections to realize accurate and fine-grained insect boundary detection. The network includes a conditional random field module for insect contour refinement and boundary localization. The FCA-ECAD model achieved 98.28% accuracy on segmentation and classification tasks covering 10 rice pests. Oddy et al. [95] proposed a semantic segmentation model for rice leaf blast and pest images based on the U-Net architecture, with parameters tuned through three optimization methods: HyperBand, random search, and Bayesian optimization. Daniya et al. [96] extracted statistical, CNN, and textural features from image segments. Furthermore, their proposed algorithm, named RideSpider Water Wave, was used to train a Deep RNN and generate optimal weights. The accuracy of the proposed algorithm in identifying brown spot, rice blast, and bacterial leaf blight was 90.5%. Reference [97] proposed a lightweight network based on copy-paste and semantic segmentation, and the authors collated a dataset for segmenting major rice diseases, including rice bacterial blight, rice blast, and brown spot, using copy-paste to enrich the collected disease samples. By replacing the backbone network with the lightweight semantic segmentation network Segformer, combining attention mechanisms, and changing the upsampling operators to train a new RSegformer model, they improved the balance between local and global information, accelerated training, and reduced network overfitting. Zhang et al. [98] proposed an improved Mask R-CNN method for identifying rice diseases. By changing the feature fusion process of the feature pyramid to bottom-up and incorporating multi-scale dilated convolution, they achieved good recognition results on a rice bacterial blight dataset.

In general, segmentation networks identify rice diseases and pests by precisely delineating and labeling affected regions at the pixel level in rice field images. These networks are trained on annotated datasets where each pixel is assigned a specific class, such as healthy rice, diseased, or pest-infested regions. During inference, segmentation models analyze images and output detailed masks, highlighting the exact locations and extents of rice diseases and pests. This level of granularity provides valuable insights for farmers and researchers, facilitating targeted interventions and improved crop management. However, segmentation networks have some limitations compared to classification and detection networks. They require extensive pixel-level annotation, which is time consuming and costly. Moreover, segmentation models tend to be computationally more intensive, making real-time applications challenging in resource-constrained settings. Additionally, they struggle with complex or overlapping instances of diseases and pests within an image. Despite these challenges, segmentation networks excel in providing fine-grained information critical for precise agricultural decision making.
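Pixel-level masks make it possible to compute both segmentation quality and a simple severity estimate. The sketch below assumes masks stored as lists of rows of integer class ids, with 0 for background, 1 for healthy leaf, and 2 for lesion; this encoding and the `lesion_ratio` helper are assumptions made for illustration, and the studies above may define accuracy and severity differently.

```python
def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class equals the annotated class."""
    total = correct = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += 1
            correct += int(p == t)
    return correct / total


def class_iou(pred, target, cls):
    """Intersection over union of one class between two masks."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            in_pred, in_target = p == cls, t == cls
            inter += int(in_pred and in_target)
            union += int(in_pred or in_target)
    return inter / union if union else float("nan")


def lesion_ratio(mask, leaf_cls=1, lesion_cls=2):
    """Share of leaf area covered by lesions (a rough severity proxy)."""
    leaf = sum(row.count(leaf_cls) for row in mask)
    lesion = sum(row.count(lesion_cls) for row in mask)
    return lesion / (leaf + lesion) if leaf + lesion else 0.0


# Tiny hypothetical masks: 0 = background, 1 = healthy leaf, 2 = lesion.
target_mask = [[0, 1, 1], [0, 2, 2]]
pred_mask = [[0, 1, 2], [0, 2, 2]]
accuracy = pixel_accuracy(pred_mask, target_mask)      # 5 of 6 pixels agree
lesion_iou = class_iou(pred_mask, target_mask, cls=2)  # 2 shared / 3 in union
```

Neither a class label nor a bounding box supports this kind of area-based estimate, which is the practical argument for accepting the higher annotation cost of segmentation.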

3.4 Fusion Attention Network

Rice diseases and pests often occupy a relatively small proportion of a crop image, making them difficult to observe with the naked eye. Additionally, although many images are carefully processed, the recognition accuracy may be low due to factors such as camera angle, distance, and complex backgrounds during shooting. Attention mechanisms have attracted widespread interest as a way to address these problems. The attention mechanism was proposed by Bahdanau et al. [99] and has since been widely used across DL. A newer network structure, the transformer [100], is composed entirely of attention mechanisms. A standard transformer comprises an encoder and a decoder. The encoder includes a self-attention layer and a feed-forward neural network, while the decoder includes a self-attention layer, an encoder-decoder attention layer, and a feed-forward neural network. Following the widespread success of transformer networks in natural language processing, many transformer variants have emerged to solve computer vision problems: ViT [101], DeiT [102], TNT [103], and Swin transformer [104] are transformer-based image classification models; DETR [105] is a transformer-based object detection model; and SETR [106] is a transformer-based semantic segmentation model. Studies have shown that incorporating attention mechanisms can improve pest feature extraction and accuracy [107–110]. The structure of a typical ViT is presented in Fig. 12. First, the input image is split into non-overlapping patches of fixed size. These patches are then flattened and mapped to embedding vectors through a linear projection, and positional embeddings are added. The primary purpose of the positional embedding is to preserve the spatial position of each patch within the original image. The resulting sequence of vectors is then fed into a series of N transformer blocks for further processing. Table 7 lists the relevant network models for identifying rice diseases and pests that employ attention mechanisms.
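The first ViT steps, patch splitting and self-attention over the resulting tokens, can be sketched without any framework. The version below is deliberately simplified: it works on a single-channel grid, omits the learned linear projection and the positional embeddings, and uses the tokens themselves as queries, keys, and values; all names are chosen for this sketch.

```python
import math


def patchify(img, patch):
    """Split an H x W grid into non-overlapping patch x patch blocks.

    Each block is flattened row by row into one vector (one token).
    """
    h, w = len(img), len(img[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    return [[img[r + i][c + j] for i in range(patch) for j in range(patch)]
            for r in range(0, h, patch)
            for c in range(0, w, patch)]


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    A real transformer block applies learned projection matrices first;
    they are left out here to keep the sketch short.
    """
    d = len(tokens[0])
    outputs = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in tokens]
        weights = softmax(scores)  # how strongly this token attends to each
        outputs.append([sum(wt * value[i] for wt, value in zip(weights, tokens))
                        for i in range(d)])
    return outputs


# A 4 x 4 toy image split into four 2 x 2 patches (four tokens of length 4).
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = patchify(image, patch=2)
```

Every output token is a weighted mixture of all input tokens, so a patch containing a small lesion can influence, and be influenced by, patches anywhere else in the image.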

Figure 12: The basic structure of ViT

Table 7: Models for rice diseases and pests based on attention mechanism

Zhou et al. [111] proposed a residual distillation transformer architecture to rapidly and accurately recognize rice diseases and pests in images. They used vision and distillation transformers as residual modules to extract key disease features, which were then fed into an MLP layer for prediction. This work pioneered the application of transformer models in the field of rice disease recognition. In experiments on four rice leaf diseases, the model achieved an F1-score of 89% and a top-1 accuracy of 92%. Yang et al. [59] developed a lightweight network called VGG-DS, which is suitable for mobile devices. This model incorporates SE attention modules to enhance feature extraction and achieved an accuracy of 93.66% on a detection task covering nine rice diseases. Wei et al. [112] proposed CG-EfficientNet, which introduces the lightweight convolutional block attention module [118] into the mobile inverted bottleneck convolution, the main module of EfficientNet-B0; they also used the Ghost module to optimize the convolution layers and reduce the number of network parameters. Finally, they employed the Adam optimization algorithm to improve the network's convergence rate. The proposed model achieved an accuracy of 95.63% in classifying five classes of rice leaves: rice bacterial blight, rice kernel smut, rice smut, rice flax spot, and healthy leaves. Zhang et al. [113] proposed a rice disease identification method based on the Swin transformer, whose sliding window operation and hierarchical design limit the attention calculation to each window and reduce the computational complexity. The model effectively classified five rice diseases (i.e., rice stripe, rice blast, rice false smut, rice brown spot, and rice sheath blight) with an accuracy rate of 93.4%. Ma et al. [114] proposed an algorithm based on a DeiT feature encoder for identifying disease types and generating relevant descriptions of rice crops. The model achieved 87.67% accuracy on the Rice2k dataset. Furthermore, a vision transformer-enabled convolutional neural network model called PlantXViT was proposed for plant disease identification [115]. The model combines the capabilities of traditional convolutional neural networks with those of vision transformers to efficiently identify a large number of plant diseases across several crops. The average accuracy for recognizing five rice diseases is reported to exceed 98.33%. Liu et al. [116] proposed a dual-path attention capsule network based on CapsNet, named MDACapsNet, to address the low accuracy of existing methods in identifying rice pests with variable positions and postures. MDACapsNet comprises an encoding module, a reconstruction module, and a classification module. The attention mechanism is mainly used in the encoding module, while a multi-scale dual attention module and a local shared dynamic routing algorithm improve the feature extraction ability and reduce the computation. An accuracy rate of 95.31% was achieved in recognition experiments on 14 rice pests. Similarly, attention-based capsule networks were used in reference [117], which introduced a convolutional attention model that combines spatial attention and channel attention mechanisms into capsule networks, enabling the model to focus on crucial features. The authors achieved an accuracy of 99.19% in recognizing five different rice pests in complex environments.

Attention networks, such as self-attention mechanisms and transformer models, identify rice diseases and pests by dynamically emphasizing relevant features within an image while downplaying less important areas. They learn to focus on specific regions of interest, like diseased plants or pest-infested areas, by assigning different attention weights to different parts of an image. This adaptability makes them powerful tools for detecting and localizing issues within rice field images. Their strengths include the ability to capture complex relationships between image elements and adapt to varying problem sizes and shapes. However, their limitations include increased computational demands, especially for large-scale images, and the need for substantial labeled data for effective training. Additionally, their interpretability is challenging, making it harder to understand the reasoning behind their predictions. Nonetheless, attention networks offer promising capabilities for fine-grained analysis of rice diseases and pests.

3.5 Few-Shot Network

Image recognition and object detection techniques in DL help to accurately predict and locate pests in farmland images. However, a dataset with sufficient samples is required, and due to the wide variety of pests, collecting thousands of training images for each category is impractical. To address this issue, small-sample learning and meta-learning have received widespread attention [119–122], and Fig. 13 displays the model architecture. The first task-driven meta-learning work on small-sample classification in the agricultural field was reported in reference [123]. The authors introduced an intuitive task-driven learning scheme and collected a balanced database covering pests and plants from publicly available resources. Through extensive comparison and experimental analysis of N-way K-shot settings and domain shift, they provided a reference and benchmark for subsequent research on small-sample learning in the agricultural field.

To address the poor generalization of DL algorithms and their dependence on large amounts of data, Wang et al. proposed a small-sample classification method called IMAL [120] for plant diseases. Using model-agnostic meta-learning, which has strong generalization ability, as the overall framework, they proposed a new soft center loss function to enhance the ability of the model to distinguish features. Moreover, they used the PReLU activation function to enhance the fitting ability of the model. Compared to three advanced few-shot learning methods, IMAL exhibited better classification performance on the PlantVillage dataset, especially in small-sample situations. Some studies have also applied few-shot learning to rice diseases and pests. Pandey proposed a few-shot meta-learning technique for rice pest detection [30] using the IP102 and ICAR-NBAIR datasets, wherein IP102 serves as the supporting dataset for meta-learning, while ICAR-NBAIR is used for few-shot learning. With 14 types of rice pests selected from the IP102 dataset, the proposed model was evaluated in 14-way 3-shot and 14-way 5-shot settings.

Figure 13: Few-shot network structure

Few-shot networks identify rice diseases and pests using a small number of labeled examples to make predictions and to adapt to new and unseen cases. They are trained on a wide range of tasks, including rice disease and pest detection, to learn versatile feature representations and improve their adaptation ability. When presented with a new rice disease or pest, they can quickly adapt and make accurate predictions based on the limited labeled examples available. Their advantages are that they can tackle data scarcity, which is common in agriculture, and that they have the potential to generalize to novel problems. However, they require a substantial amount of pre-training data, there is a risk of overfitting to the few-shot tasks, and their interpretability may be challenging due to their complex architectures. Nevertheless, few-shot learning networks provide a promising approach for effective and efficient rice disease and pest detection, particularly in situations with limited labeled data.
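To make the N-way K-shot setting concrete, the following sketch samples one episode and classifies the query samples by their nearest class prototype (the mean of each class's support vectors). Nearest-prototype classification is chosen here only because it is compact; it is not the method of the studies cited above. The class names and the 2-D "embeddings" are hypothetical stand-ins for features produced by a trained network.

```python
import random


def sample_episode(data, n_way, k_shot, n_query, rng):
    """Draw one N-way K-shot episode from {class_name: [feature_vector, ...]}."""
    classes = rng.sample(sorted(data), n_way)
    support, query = {}, []
    for c in classes:
        items = rng.sample(data[c], k_shot + n_query)
        support[c] = items[:k_shot]                    # K labeled examples
        query += [(x, c) for x in items[k_shot:]]      # held-out queries
    return support, query


def prototype(vectors):
    """Mean of the support vectors of one class."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def classify(x, prototypes):
    """Assign x to the class with the nearest prototype (squared Euclidean)."""
    def dist(p):
        return sum((a - b) ** 2 for a, b in zip(x, p))
    return min(prototypes, key=lambda c: dist(prototypes[c]))


# Hypothetical data: three pest classes forming well-separated clusters.
rng = random.Random(0)
centers = {"pest_a": (0.0, 0.0), "pest_b": (5.0, 0.0), "pest_c": (0.0, 5.0)}
data = {
    c: [[cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5)]
        for _ in range(10)]
    for c, (cx, cy) in centers.items()
}

# One 3-way 3-shot episode with 2 query samples per class.
support, query = sample_episode(data, n_way=3, k_shot=3, n_query=2, rng=rng)
prototypes = {c: prototype(vs) for c, vs in support.items()}
accuracy = sum(classify(x, prototypes) == c for x, c in query) / len(query)
```

Because the toy clusters are far apart relative to their spread, every query falls nearest to its own prototype; real pest embeddings overlap far more, which is where the choice of meta-learning method matters.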

4 Performance Comparison

Four metrics are used to evaluate the detection and recognition of rice diseases and pests: Accuracy, Precision, Recall, and F1-score (F1). Before introducing them, several symbols must be explained, each of which has its own meaning in classification and detection tasks.

• TP (True Positive) is the number of samples correctly assigned to a category in classification tasks, or the number of correct detections in detection tasks.

• FP (False Positive) is the number of samples incorrectly assigned to a category in classification tasks, or the number of predicted bounding boxes that are misclassified or do not conform to a ground-truth box in detection tasks.

• FN (False Negative) is the number of instances that belong to a particular category but that the model fails to identify as such.

• TN (True Negative) is the counterpart of TP: the number of instances that the model correctly identifies as not belonging to a particular category.

Based on the above, the Accuracy metric is defined as the proportion of correctly predicted samples among all samples, as shown in Eq. (1).

Precision is specific to prediction results, representing how many of the predicted positive samples are truly positive, and reflects the correctness of a category’s prediction, as shown in Eq. (2).

Recall represents the ability of the model to find all relevant targets, that is, the proportion of real targets that are covered by the predictions of the model, as shown in Eq. (3).

F1 is the harmonic mean of precision and recall, as shown in Eq. (4).
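The equations referenced above are not reproduced in this text. Assuming that Eqs. (1)–(4) follow the standard definitions in terms of TP, FP, FN, and TN, they read:

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \tag{1}\\
\text{Precision} &= \frac{TP}{TP + FP} \tag{2}\\
\text{Recall}    &= \frac{TP}{TP + FN} \tag{3}\\
F1 &= \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}
\end{align}
```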

Finally, three further metrics are introduced to evaluate a model more effectively, as shown in Eqs. (5)–(7).

where Precision is the precision over the images, Recall is the proportion of all positive samples in the images that are correctly predicted, and n denotes the number of categories.
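As a worked illustration, the count-based metrics and a category-averaged metric can be computed as in the sketch below. Eqs. (5)–(7) are not reproduced in this text, so the average-precision form used here (non-interpolated AP per category, averaged over the n categories to give mAP) is an assumption, not the authors' definition. All function names and input values are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, Precision, Recall and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


def average_precision(scored_hits, n_positives):
    """Non-interpolated average precision for one category.

    scored_hits: list of (confidence, is_true_positive), one per prediction.
    n_positives: number of ground-truth objects of this category.
    Precision is accumulated at every rank where a true positive occurs,
    i.e. at every step where recall increases by 1 / n_positives.
    """
    ranked = sorted(scored_hits, key=lambda p: p[0], reverse=True)
    tp = 0
    ap = 0.0
    for rank, (_, hit) in enumerate(ranked, start=1):
        if hit:
            tp += 1
            ap += tp / rank  # precision at this rank
    return ap / n_positives if n_positives else 0.0


def mean_average_precision(ap_per_category):
    """mAP: mean of the per-category AP values over n categories."""
    return sum(ap_per_category) / len(ap_per_category)


# Hypothetical counts for one category.
accuracy, precision, recall, f1 = classification_metrics(tp=8, fp=2, fn=2, tn=88)

# Hypothetical ranked detections for two categories.
ap_a = average_precision([(0.9, True), (0.8, False), (0.7, True)], n_positives=2)
ap_b = average_precision([(0.6, True)], n_positives=1)
m_ap = mean_average_precision([ap_a, ap_b])
```

Detection benchmarks differ in how they interpolate the precision-recall curve and in the overlap threshold used to decide whether a detection counts as a true positive, so reported mAP values are comparable only under the same protocol.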

The evaluation results of the pest detection and identification models in the literature were compared. Tables 8 to 10 present the performance evaluations of different image classification models, object detection models, and image segmentation network models on rice disease and pest identification. Taking the rice pest dataset in IP102 as an example, the ARGAN network was used to augment the data, and the results of ResNet, VGG16, and MobileNet were compared based on the above indicators. Table 8 lists the accuracy results of the different models. In Table 9, one-stage, two-stage, and anchor-free algorithms are compared in terms of performance. Finally, based on the semantic segmentation network model, the performance of mainstream algorithms on rice diseases is compared in Table 10.

Table 8: Performance comparison of image classification models on the same rice pest datasets. Note: all data are from reference [32] and were collected in the field.

Table 9: Performance comparison of object detection models on different rice pest datasets

Table 10: Performance comparison of image segmentation models on the same rice disease datasets. Note: all data are from reference [98] and were collected in the field.

With the continuous development of DL, the performance of typical algorithms on different rice disease and pest datasets has gradually improved, as reflected in accuracy, mAP, F1-score, and other indicators. However, an open and comprehensive dataset of rice diseases and pests that allows a unified comparison of all algorithms is still lacking, and the complexity of the rice pest images used in existing research still falls short of what real-time detection and recognition on mobile devices requires. Therefore, both the datasets and the algorithm performance need to be improved in future studies.

5 Challenges and Future Directions

Although deep learning has achieved significant results in detecting and recognizing rice diseases and pests, it still faces unresolved challenges, mainly in the following aspects. Potential solutions are discussed for each.

(1) Data acquisition

Obtaining large-scale annotated image data of rice pests and diseases is expensive and time-consuming. During dataset construction, some rice pests and diseases occur less frequently, resulting in category imbalance. In addition, lighting and environmental conditions vary between rice fields depending on location and season, which affects the quality and characteristics of the images. Although some rice disease and pest datasets are publicly available, their quality varies, making it essential to collect data from different regions and rice varieties to create a more representative dataset.

From a human perspective, future data collection can rely on collaborative efforts through crowdsourcing platforms or farmer cooperatives. The use of high-resolution sensors, such as those on UAVs and smartphones, for data acquisition is also pivotal. At the same time, fostering data sharing and cooperation among research institutions is crucial for collectively establishing larger-scale datasets. From a technological perspective, data augmentation and transfer learning techniques can address the limited availability of annotated data for detecting rice diseases and pests. Class imbalance among rice disease and pest categories can be mitigated through resampling and loss function adjustments. In addition, data preprocessing methods such as image normalization can reduce the influence of varying environmental conditions.
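Two of the technical remedies mentioned above, loss function adjustment for class imbalance and image normalization, can be sketched as follows. Inverse-frequency weighting and per-channel standardization are common choices assumed here for illustration; the class names, counts, and pixel values are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class loss weights proportional to 1 / class frequency.

    Scaled so that a perfectly balanced dataset gives every class weight 1.0.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def normalize_channel(pixels):
    """Zero-mean, unit-variance normalization of one image channel."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    std = var ** 0.5 or 1.0  # fall back to 1.0 for a perfectly flat channel
    return [(p - mean) / std for p in pixels]


# Hypothetical imbalanced label set: one common and two rare categories.
labels = ["blast"] * 80 + ["brown_spot"] * 15 + ["hispa"] * 5
weights = inverse_frequency_weights(labels)

# The same hypothetical patch under bright and dim lighting (uniform offset).
bright = normalize_channel([200, 210, 220])
dim = normalize_channel([20, 30, 40])
```

Rare categories receive larger weights, so their errors contribute more to a weighted loss, and the two lighting variants map to the same normalized values. Real illumination changes are rarely a uniform offset, so normalization reduces rather than removes their influence.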

(2) Models for deep learning

Deep learning models often require large amounts of computing resources and high-performance hardware, and they are considered black-box models, which makes their decision-making process difficult to explain. Another issue is that a model trained on one region or rice variety may not generalize well to others. Designing deep learning models for the recognition and detection of rice diseases and pests can therefore be challenging.

In future research, transfer learning can start from pretrained models that excel in related domains, such as plant disease detection, and fine-tune them for rice disease and pest detection. Alternatively, AutoML (automated machine learning) techniques can be employed to search for a network architecture tailored to the requirements of rice disease and pest detection. Furthermore, combining different types of deep learning models, such as convolutional neural networks and recurrent neural networks, can yield improved performance.

(3) Practical applications in rice crop fields

The complete process of using deep learning to identify and detect rice diseases and pests includes data collection, data preprocessing, model training, model evaluation, deployment, and continuous monitoring and improvement. Although many studies achieve satisfactory accuracy, there are doubts about the feasibility of deploying these models on distributed systems and terminal devices. On the one hand, these models tend to be overly large, and practical field devices often lack the resources needed to support prolonged model execution. On the other hand, achieving network coverage in agricultural production is currently challenging, so model updates and online learning remain manual processes, which increases labor costs.

In future research focused on practical applications in rice fields, promising areas of investigation include feature engineering, the development of lightweight models, incremental learning, local decision-making, and the establishment of collaborative networks. First, the fusion of multimodal rice data, including visible light images, infrared images, and multispectral images, can be explored to enhance detection accuracy by combining data from different sensors. Second, it is crucial to construct lightweight deep learning models that reduce model size and computational complexity. Furthermore, incremental learning allows a model to gradually adapt to new data and new types of rice diseases and pests, and allowing models on terminal devices to make local decisions can reduce communication costs. Lastly, establishing collaborative networks that enable multiple devices to share model updates reduces data transfer and lightens the workload of each device.
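One established way for devices to share model updates instead of raw images is federated averaging, in which a coordinator combines locally trained parameters weighted by each device's sample count. The text does not name a specific scheme, so this choice is an assumption made for illustration; the parameter vectors and sample counts are hypothetical.

```python
def federated_average(device_weights, device_sample_counts):
    """Sample-weighted average of per-device model parameters.

    device_weights: one flat parameter list per device, all the same length.
    device_sample_counts: number of local training samples on each device.
    """
    total = sum(device_sample_counts)
    n_params = len(device_weights[0])
    return [
        sum(w[i] * n for w, n in zip(device_weights, device_sample_counts)) / total
        for i in range(n_params)
    ]


# Hypothetical two-parameter models trained on three field devices.
device_weights = [[0.0, 1.0], [1.0, 1.0], [4.0, 1.0]]
device_sample_counts = [100, 100, 200]
global_weights = federated_average(device_weights, device_sample_counts)
```

Only the parameter lists travel over the network, which is the reduction in data transfer that the paragraph refers to; the device with more local samples has proportionally more influence on the shared model.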

6 Conclusion

Manual detection of rice diseases and pests is often time-consuming and labor-intensive, and it requires specialized knowledge. The high accuracy and reliability of deep learning techniques can help farmers, agricultural experts, and government departments better understand the species, distribution, and severity of diseases and pests. This paper reviews recent applications of deep learning in rice disease and pest detection and recognition, including image classification, object detection, semantic segmentation, attention mechanisms, and small-sample learning, and summarizes and compares the performance of various methods. Numerous researchers have made remarkable contributions to the deep learning-based recognition and detection of rice diseases and pests in paddy fields. However, widespread practical implementation remains a challenge. To fully realize the development potential and application value of deep learning technology, it is essential for experts from the relevant fields to collaborate and integrate their expertise in rice crop protection with deep learning algorithms and models.

Acknowledgement: Not applicable.

Funding Statement: This research work is funded by Hunan Provincial Natural Science Foundation of China with Grant Numbers (2022JJ50016, 2023JJ50096), Innovation Platform Open Fund of Hengyang Normal University Grant 2021HSKFJJ039, Hengyang Science and Technology Plan Guiding Project with Number 202222025902.

Author Contributions: The authors confirm contribution to the paper as follows: study conception and design: Jinhua Zheng; data collection: Xiaozhong Yu; analysis and interpretation of results: Xiaozhong Yu, Jinhua Zheng; draft manuscript preparation: Xiaozhong Yu. All authors reviewed the results and approved the final version of the manuscript.

Availability of Data and Materials: Not applicable.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.