Artificial Intelligence Based Prostate Cancer Classification Model Using Biomedical Images

2022-08-24

Computers, Materials & Continua, 2022, Issue 8

Areej A. Malibari, Reem Alshahrani, Fahd N. Al-Wesabi, Siwar Ben Haj Hassine, Mimouna Abdullah Alkhonaini and Anwer Mustafa Hilal

1 Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia

2 Department of Computer Science, College of Computers and Information Technology, Taif University, Taif, 21944, Saudi Arabia

3 Department of Computer Science, College of Science & Art at Mahayil, King Khalid University, Saudi Arabia

4 Department of Computer Science, College of Computer and Information Sciences, Prince Sultan University, Saudi Arabia

5 Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam bin Abdulaziz University, AlKharj, Saudi Arabia

Abstract: Medical image processing has become a hot research topic in the healthcare sector for effective decision making and disease diagnosis. Magnetic resonance imaging (MRI) is a widely utilized tool for the classification and detection of prostate cancer. Since manual screening for prostate cancer is difficult, automated diagnostic methods have become essential. This study develops a novel Deep Transfer Learning based Prostate Cancer Classification (DTL-PSCC) model using MRI images. The presented DTL-PSCC technique employs an EfficientNet based feature extractor to generate a set of feature vectors. In addition, the fuzzy k-nearest neighbour (FKNN) model is utilized for classification, where class labels are allotted to the input MRI images. Moreover, the membership value of the FKNN model is optimally tuned by the krill herd algorithm (KHA), which results in improved classification performance. To demonstrate the classification outcome of the DTL-PSCC technique, a wide range of simulations was carried out on benchmark MRI datasets. The comparative results show that the DTL-PSCC technique outperforms recent methods, with a maximum accuracy of 85.09%.

Keywords: MRI images; prostate cancer; deep learning; medical image processing; metaheuristics; krill herd algorithm

1 Introduction

Prostate cancer is one of the most common forms of cancer, accounting for 26% of cancer diagnoses among American men [1]. So far, prostate cancer has been detected by systematic biopsies, in which multiple specimens are taken from the prostate with a core needle. A systematic biopsy is invasive and has low sensitivity; furthermore, it carries a risk of bleeding, infection, and sepsis [2]. A non-invasive imaging technique that diagnoses prostate cancer at an earlier stage might improve both diagnosis and treatment. As a newer ultrasound technique, contrast-enhanced ultrasound (CEUS) offers a suitable modality for visualizing the dynamic pattern of blood flow, allowing clinical experts to identify angiogenesis for cancer detection [3,4]. Until now, numerous studies of CEUS based prostate cancer diagnosis have been carried out by measuring distinct parameters of the time intensity curve (TIC). Machine learning (ML) is a subfield of artificial intelligence (AI) based on the idea that a system learns patterns from a large-scale dataset through statistical and probabilistic mechanisms and then makes predictions or decisions on new information [5]. In the medical imaging sector, computer-aided detection and diagnosis (CAD), which integrates imaging feature engineering with ML classification, has shown promising results in supporting radiotherapists in making precise diagnoses and in reducing the time and cost of diagnosis [6].

Conventional feature engineering approaches depend on quantitative imaging feature extraction [7], such as intensity, texture, volume, shape, and various statistical features from the imaging data, together with ML classifiers like Decision Tree (DT), Support Vector Machines (SVM), and AdaBoost. Deep learning (DL) has shown effective results in different kinds of computer vision (CV) tasks like object detection, segmentation, and classification. Nonetheless, attaining an effective implementation requires precise fine-tuning of the hyperparameters and an optimal structure and combination of layers. This remains one of the key challenges of DL-based approaches when they are employed in sectors like medical imaging [8-10]. With the promising results of convolutional neural networks (CNN) in CV, medical imaging researchers have shifted their interest towards DL-based approaches for designing CAD systems for cancer diagnosis.

This study develops a novel Deep Transfer Learning based Prostate Cancer Classification (DTL-PSCC) model using MRI images. The presented DTL-PSCC technique employs an EfficientNet based feature extractor to generate a set of feature vectors. In addition, the fuzzy k-nearest neighbour (FKNN) model is utilized for classification, where class labels are allotted to the input MRI images. Moreover, the membership value of the FKNN model is optimally tuned by the krill herd algorithm (KHA), which results in improved classification performance. To demonstrate the enhanced classification outcome of the DTL-PSCC technique, a wide range of simulations was carried out on benchmark MRI datasets.

The rest of the study is organized as follows. Section 2 offers the related works, Section 3 discusses the proposed model, and Section 4 provides the experimental validation. Lastly, Section 5 draws the conclusion.

2 Literature Review

Zhang et al. [11] integrated GrowCut, Zernike feature extraction, and extreme learning machine (ELM) approaches for lesion segmentation in MRI and prostate cancer diagnosis. They utilized the GrowCut approach to segment the suspicious cancer region and integrated ML models in an ensemble to diagnose prostate cancer. De Vente et al. [12] developed a neural network (NN) system that detects and grades cancer tissue simultaneously in an end-to-end manner. It is more clinically applicable than the classification goals of the ProstateX-2 challenge. They utilized the challenge data set for training and testing, and employed a two-dimensional U-Net with MRI as input and, as output, a lesion segmentation map that encodes the Gleason Grade Group (GGG), a measure of cancer aggressiveness.

Ye [13] designed an AI-based method (called AI-biopsy) for the earlier diagnosis of prostate cancer through MRI labelled with histopathology data. The DL method is designed to differentiate 1) higher-risk tumors from lower-risk tumors and 2) benign from cancerous tumors. Alkadi et al. [14] trained a deep convolutional encoder-decoder framework for segmenting the malignant lesions, the prostate, and the anatomical structure. To integrate the 3D contextual spatial data given by the MRI, they present a three-dimensional sliding window model that preserves two-dimensional domain complexity while utilizing three-dimensional data.

Feng et al. [15] introduced a DL architecture for diagnosing prostate cancer in CEUS images. The presented approach extracts features uniformly from the temporal and spatial dimensions by carrying out 3D convolution operations, which capture the dynamic perfusion information encoded in multiple adjacent frames. The DL model was trained and validated against expert delineations using CEUS images recorded with two kinds of contrast agents. In [16], a robust DL based convolutional neural network (DL-CNN) approach is applied by means of transfer learning (TL). The outcomes are compared with several ML approaches. Cancer MRI databases are employed for training the ML classifiers and GoogleNet, and different features such as entropy based, morphological, texture, Elliptic Fourier Descriptors, and Scale Invariant Feature Transform (SIFT) are extracted.

3 The Proposed Model

In this study, an effective DTL-PSCC technique has been developed to classify prostate cancer using MRI images. The proposed DTL-PSCC technique involves several subprocesses, namely preprocessing, EfficientNet based feature extraction, FKNN based classification, and KHA based parameter tuning. The membership value of the FKNN model is optimally tuned by the KHA, which results in improved classification performance.

3.1 Pre-processing

The ground truth given by the PROSTATEx-2 challenge is the coordinate point at the centre of each lesion [17]. A region of interest (ROI) of size 65×65 around the ground truth is cropped from the T2W image, and an ROI of size 21×21 is collected to extract 2D gray-level co-occurrence matrix (GLCM) features. The ROI size was selected after manual inspection so that the largest tumor in the provided data set fits inside the ROI. Lloyd-Max quantization was executed on the ROI, with the number of gray levels fixed to the baseline parameter of 32, which minimizes the mean square error (MSE) for the given number of quantization levels.
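The cropping and quantization steps can be illustrated with a minimal sketch. The function names `crop_roi`, `lloyd_max_levels`, and `quantize`, the synthetic test image, and the iteration limit are assumptions made here for illustration; the paper does not give implementation details. The quantizer follows the usual Lloyd-Max alternation of midpoint thresholds and cell means.

```python
import bisect
from typing import List, Sequence, Tuple


def crop_roi(image: Sequence[Sequence[float]], center: Tuple[int, int], size: int) -> List[List[float]]:
    """Crop a size x size window centred on (row, col); size is assumed odd."""
    half = size // 2
    row, col = center
    if row - half < 0 or col - half < 0 or row + half >= len(image) or col + half >= len(image[0]):
        raise ValueError("ROI extends past the image border")
    return [list(r[col - half:col + half + 1]) for r in image[row - half:row + half + 1]]


def lloyd_max_levels(values: Sequence[float], n_levels: int = 32, n_iter: int = 30) -> List[float]:
    """Lloyd-Max iteration: thresholds at level midpoints, levels at cell means."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_levels
    levels = [lo + (k + 0.5) * step for k in range(n_levels)]  # uniform starting point
    for _ in range(n_iter):
        thresholds = [(a + b) / 2.0 for a, b in zip(levels, levels[1:])]
        sums, counts = [0.0] * n_levels, [0] * n_levels
        for v in values:
            k = bisect.bisect_right(thresholds, v)  # index of the cell containing v
            sums[k] += v
            counts[k] += 1
        # Empty cells keep their previous level.
        updated = [sums[k] / counts[k] if counts[k] else levels[k] for k in range(n_levels)]
        if updated == levels:
            break
        levels = updated
    return levels


def quantize(roi: Sequence[Sequence[float]], levels: Sequence[float]) -> List[List[int]]:
    """Map every pixel to the index of its nearest reconstruction level."""
    thresholds = [(a + b) / 2.0 for a, b in zip(levels, levels[1:])]
    return [[bisect.bisect_right(thresholds, p) for p in row] for row in roi]


# Synthetic 128 x 128 image standing in for a T2W slice (hypothetical data).
image = [[(7 * r + 13 * c) % 256 for c in range(128)] for r in range(128)]
roi = crop_roi(image, center=(64, 64), size=65)          # 65 x 65 lesion ROI
texture_roi = crop_roi(image, center=(64, 64), size=21)  # 21 x 21 ROI for GLCM features
levels = lloyd_max_levels([p for row in roi for p in row], n_levels=32)
quantized = quantize(roi, levels)
```

The quantized ROI holds integer gray-level indices in the range 0 to 31, which is the form a GLCM computation normally expects.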

3.2 Feature Extraction: EfficientNet Model

CNN is a typical DL approach that can produce cutting-edge results for almost all classification problems [18]. CNNs achieve good results on image classification, and they can also yield high accuracy on text data. A CNN is mainly utilized for automatically extracting features from the input data set, in contrast to conventional ML methods, where the user needs to select the features. 2D and 3D CNNs are employed for image and video data, respectively, while a 1D CNN is applied to text classification. The CNN framework employs a sequence of convolution layers for extracting features from the input data set. A max pooling layer follows every convolution layer and reduces the dimension of the extracted features. In the convolution layer, the size of the kernel plays an important role in feature extraction. The kernel size and the number of filters are hyperparameters of the model.

For text input, an embedding layer translates each word into a vector space based on how frequently words appear close to one another. The embedding layer uses random weights to learn an embedding for each term in the training data set. The softmax layer is utilized as the classification layer, which can achieve good results for multi-class problems. The softmax function contains N units, where N is the number of units (one per class). All units are fully connected to the preceding layer and compute the probability of each of the N classes as follows:

$$P(c = m \mid x) = \frac{\exp(W_m x + b_m)}{\sum_{n=1}^{N} \exp(W_n x + b_n)} \qquad (1)$$

where $W_m$ indicates the weight matrix that connects the $m$th unit to the preceding layer, $x$ denotes the output of the final layer before the softmax, and $b_m$ signifies the bias of the $m$th unit. CNN is a type of DL model and is widely used on images. In recent times, DL has been used extensively in the analysis of several medical conditions. Likewise, a few studies have applied DL to the analysis of skin disease. A DL model has several connected layers with numerous weights and activation functions. A basic DL model comprises convolution, pooling, and fully connected layers. Different activation functions can be used while the weights are adjusted. The activation function generates a feature map that is passed as input to the succeeding layer.
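As a small numerical illustration of the softmax layer described above, the following sketch computes the class probabilities from a weight matrix, a bias vector, and an input vector. The function name and the example values are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def softmax_probabilities(W: Sequence[Sequence[float]], b: Sequence[float], x: Sequence[float]) -> List[float]:
    """Return the probability of each of the N classes for input vector x."""
    # One logit per output unit: W_m . x + b_m
    logits = [sum(w * xi for w, xi in zip(row, x)) + bm for row, bm in zip(W, b)]
    shift = max(logits)  # subtracting the maximum avoids overflow, result is unchanged
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Three output units and a two-dimensional feature vector (made-up numbers).
W = [[0.2, -0.1], [0.0, 0.3], [-0.4, 0.1]]
b = [0.0, 0.1, -0.1]
probs = softmax_probabilities(W, b, x=[1.0, 2.0])
predicted_class = probs.index(max(probs))
```

The probabilities sum to one, and the predicted class is the unit with the largest logit.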

The pooling and convolution layers are utilized to extract features. These layers extract the visual features and capture the complex nature of the image. However, the nature of cancer lesions is highly complex, and developing an automated analysis model using DL is challenging. To alleviate this issue, TL is employed. Fig. 1 illustrates the structure of the EfficientNet technique. In the current study, EfficientNetB3 is utilized for prostate cancer recognition. EfficientNetB3 is a recent, cost-efficient, and robust model established by scaling three parameters: depth, width, and resolution [19]. An EfficientNetB3 model with noisy-student weights is utilized in scenarios I and III for the TL method, whereas the "isicall_eff3_weights" weights are utilized for pretraining in scenarios II and IV. The number of parameters is thereby decreased. Besides, the rectified linear unit (ReLU) activation function is employed with three dense and two dropout layers. The output layer has several output units for multiclass classification using the softmax activation function.

Figure 1: Framework of EfficientNet
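The idea of scaling depth, width, and resolution together can be sketched with the compound scaling rule of the EfficientNet family. The base coefficients below (1.2, 1.1, 1.15) are the values reported for the EfficientNet baseline and are an assumption here, since this paper does not state them; the released B1 to B7 variants use rounded multipliers, so the sketch does not reproduce EfficientNetB3 exactly.

```python
from typing import Dict

# Base coefficients for depth, width, and resolution (assumed, not from this paper).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi: float) -> Dict[str, float]:
    """Multipliers applied to the baseline network for compound coefficient phi."""
    return {
        "depth": ALPHA ** phi,       # more layers
        "width": BETA ** phi,        # more channels per layer
        "resolution": GAMMA ** phi,  # larger input images
    }


def flops_multiplier(phi: float) -> float:
    """Approximate growth in FLOPs: depth * width^2 * resolution^2."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


scales = compound_scale(1)
growth = flops_multiplier(1)
```

Because the product of depth, squared width, and squared resolution is close to 2 for these coefficients, each unit increase of `phi` roughly doubles the computational cost.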

3.3 Optimal Fuzzy KNN Based Classification

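In this section, the FKNN classifier is applied to detect and classify the different classes of prostate cancer. In the FKNN model, the fuzzy membership values of the instances are allocated to distinct class labels as given below [20]:

$$u_i(x) = \frac{\sum_{j=1}^{K} u_{ij}\left(1/\|x-x_j\|^{2/(m-1)}\right)}{\sum_{j=1}^{K}\left(1/\|x-x_j\|^{2/(m-1)}\right)} \qquad (2)$$

where $i=1,2,\ldots,C$ and $j=1,2,\ldots,K$; $C$ denotes the number of classes and $K$ represents the number of nearest neighbors. The fuzzy strength parameter $m$ determines how heavily the distance is weighted when computing every neighbor's contribution to the membership value, and it is usually chosen as $m\in(1,\infty)$. $\|x-x_j\|$ is the distance between $x$ and its $j$th nearest neighbor $x_j$, with the Euclidean distance commonly chosen as the distance measure. Moreover, $u_{ij}$ indicates the degree of membership of $x_j$ from the training data in class $i$, among the k-nearest neighbors (KNN) of $x$. Here, the constrained fuzzy membership scheme is used, where the KNN of every training sample $x_k$ is computed and the membership of $x_k$ in every class is allocated using Eq. (3):

$$u_{ij}(x_k) = \begin{cases} 0.51 + (n_j/K)\times 0.49, & \text{if } j = i \\ (n_j/K)\times 0.49, & \text{if } j \neq i \end{cases} \qquad (3)$$

where $i$ is the labelled class of $x_k$ and $n_j$ denotes the number of neighbors that belong to the $j$th class. The membership values need to fulfill the following conditions:

$$\sum_{i=1}^{C} u_{ij} = 1,\quad 0 < \sum_{j=1}^{n} u_{ij} < n,\quad u_{ij}\in[0,1],\quad j=1,2,\cdots,n \qquad (4)$$

where $C$ is the number of classes and $n$ is the number of training samples. Once the membership values are calculated, the sample is allocated to the class with the maximum degree of membership, i.e.,

$$C(x) = \arg\max_{i=1,\ldots,C} u_i(x) \qquad (5)$$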
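A compact sketch of the fuzzy KNN procedure is given below. It assumes the commonly used formulation in which training memberships are initialized with the constants 0.51 and 0.49 and test memberships are weighted by inverse distance raised to 2/(m-1); the function names, the toy data, and the small `eps` guard against zero distances are illustrative assumptions, not the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple


def init_memberships(X: Sequence[Sequence[float]], y: Sequence[int], n_classes: int, k: int) -> List[List[float]]:
    """Constrained memberships of every training sample in every class."""
    U = []
    for idx, xi in enumerate(X):
        others = sorted((j for j in range(len(X)) if j != idx), key=lambda j: math.dist(xi, X[j]))[:k]
        counts = [0] * n_classes
        for j in others:
            counts[y[j]] += 1
        # 0.51 bonus for the labelled class, 0.49 shared according to neighbor counts.
        U.append([(0.51 if c == y[idx] else 0.0) + 0.49 * counts[c] / k for c in range(n_classes)])
    return U


def fknn_predict(x: Sequence[float], X: Sequence[Sequence[float]], U: Sequence[Sequence[float]],
                 k: int, m: float = 2.0, eps: float = 1e-12) -> Tuple[int, List[float]]:
    """Return the class with maximum membership and the membership vector of x."""
    nearest = sorted(range(len(X)), key=lambda j: math.dist(x, X[j]))[:k]
    weights = [1.0 / (math.dist(x, X[j]) ** (2.0 / (m - 1.0)) + eps) for j in nearest]
    total = sum(weights)
    n_classes = len(U[0])
    u = [sum(w * U[j][c] for w, j in zip(weights, nearest)) / total for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: u[c]), u


# Toy two-class feature vectors (hypothetical stand-ins for EfficientNet features).
X_train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
y_train = [0, 0, 0, 1, 1, 1]
U_train = init_memberships(X_train, y_train, n_classes=2, k=2)
label, memberships = fknn_predict([0.5, 0.5], X_train, U_train, k=3, m=2.0)
```

The two parameters `k` and `m` are the natural candidates for tuning by a metaheuristic such as the KHA, for example by minimizing the validation error of `fknn_predict`.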

For tuning the parameters involved in the FKNN model, the KHA is applied. Antarctic krill is one of the major animal species on Earth. The capability to form large swarms is the most important feature of this species. When predators such as whales and seals attack, they remove individual krill from the herd. This attack decreases the density of the krill herd (KH). The re-formation of the KH after predation is influenced by several parameters. The herding behaviour of krill individuals has two essential goals: increasing krill density and reaching food. The KH technique utilizes this multiobjective herding to solve global optimization problems. As a result, a krill individual moves towards an optimum solution as it searches for the highest herd density and for food. This behaviour gathers the KH around the global minimum of the optimization problem.

The time-dependent position of an individual krill on a 2D surface is governed by the following three main actions [21]:

1. Movement induced by other krill individuals;

2. Foraging motion;

3. Physical or random diffusion.

The following Lagrangian model is generalized to an n-dimensional decision space:

$$\frac{dX_i}{dt} = N_i + F_i + D_i \qquad (6)$$

where $N_i$ refers to the motion induced by other krill individuals, $F_i$ is the foraging motion, and $D_i$ represents the physical diffusion of the $i$th krill individual.

The induced movement of each krill individual is determined as:

$$N_i^{new} = N^{max}\alpha_i + \omega_n N_i^{old},\qquad \alpha_i = \alpha_i^{local} + \alpha_i^{target} \qquad (7)$$

where $N^{max}$ stands for the maximum induced speed, which, based on measured values, is taken as 0.01 (m/s). $\omega_n$ refers to the inertia weight of the induced motion in the range zero to one. $\alpha_i^{local}$ is the local effect provided by the neighbors, $\alpha_i^{target}$ is the target direction effect provided by the best krill individual, and $N_i^{old}$ denotes the last induced motion. $\omega_n$ is set to 0.9 at the start of optimization and is then linearly reduced to 0.1. Fig. 2 demonstrates the flowchart of the KH technique. The effect of the neighbors is considered as an attractive or repulsive tendency among the individuals for local search. $\alpha_i^{target}$, the target direction effect provided by the best krill individual, is determined as:

$$\alpha_i^{target} = C^{best}\,\hat{K}_{i,best}\,\hat{X}_{i,best} \qquad (8)$$

where $\hat{K}_{i,best}$ and $\hat{X}_{i,best}$ are the normalized fitness difference and the unit direction vector between the $i$th krill and the best krill, and $C^{best}$ is the effective coefficient of the best krill individual, determined as:

$$C^{best} = 2\left(rand + \frac{I}{I_{max}}\right) \qquad (9)$$

where $rand$ is a random number between zero and one, $I$ refers to the current iteration number, and $I_{max}$ is the maximum number of iterations.
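The three motion components can be combined into a simplified krill herd optimizer, sketched below. Only the maximum induced speed of 0.01, the inertia weight decreasing linearly from 0.9 to 0.1, and the coefficient 2(rand + I/I_max) are taken from the text. The foraging speed `v_f`, diffusion speed `d_max`, step constant `c_t`, the use of all krill as neighbors, the omission of a personal-best term, and the requirement of a non-negative objective are assumptions of this sketch, and `sphere` is a placeholder objective rather than the FKNN tuning objective used by the authors.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def _unit(src: Sequence[float], dst: Sequence[float]) -> Vector:
    """Unit vector pointing from src to dst (zero vector if they coincide)."""
    diff = [d - s for s, d in zip(src, dst)]
    norm = math.sqrt(sum(v * v for v in diff))
    return [v / norm for v in diff] if norm > 0.0 else [0.0] * len(diff)


def krill_herd(objective: Callable[[Vector], float], bounds: Sequence[Tuple[float, float]],
               n_krill: int = 20, n_iter: int = 100, n_max: float = 0.01, v_f: float = 0.02,
               d_max: float = 0.005, c_t: float = 0.5, seed: int = 0) -> Tuple[Vector, float, List[float]]:
    """Minimize a non-negative objective; returns best position, best value, best-so-far history."""
    rng = random.Random(seed)
    dim = len(bounds)
    dt = c_t * sum(hi - lo for lo, hi in bounds)  # time step scaled by the search range
    X = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_krill)]
    K = [objective(x) for x in X]
    N = [[0.0] * dim for _ in range(n_krill)]  # induced motion
    F = [[0.0] * dim for _ in range(n_krill)]  # foraging motion
    best_f = min(K)
    best_x = list(X[K.index(best_f)])
    history = [best_f]
    for it in range(1, n_iter + 1):
        frac = it / n_iter
        omega = 0.9 - 0.8 * (it - 1) / max(n_iter - 1, 1)  # inertia: 0.9 down to 0.1
        spread = (max(K) - min(K)) or 1.0
        leader = K.index(min(K))
        # Food centre: fitness-weighted mean position (assumes objective >= 0).
        weights = [1.0 / (k + 1e-12) for k in K]
        food = [sum(w * x[d] for w, x in zip(weights, X)) / sum(weights) for d in range(dim)]
        k_food = objective(food)
        c_best = 2.0 * (rng.random() + frac)
        c_food = 2.0 * (1.0 - frac)
        new_X = []
        for i in range(n_krill):
            alpha = [0.0] * dim
            for j in range(n_krill):  # local effect of the neighbors
                if j != i:
                    k_hat = (K[i] - K[j]) / spread
                    alpha = [a + k_hat * u for a, u in zip(alpha, _unit(X[i], X[j]))]
            k_hat_best = (K[i] - K[leader]) / spread  # target effect of the best krill
            alpha = [a + c_best * k_hat_best * u for a, u in zip(alpha, _unit(X[i], X[leader]))]
            N[i] = [n_max * a + omega * n for a, n in zip(alpha, N[i])]
            k_hat_food = (K[i] - k_food) / spread
            beta = [c_food * k_hat_food * u for u in _unit(X[i], food)]
            F[i] = [v_f * b + omega * f for b, f in zip(beta, F[i])]
            D = [d_max * (1.0 - frac) * rng.uniform(-1.0, 1.0) for _ in range(dim)]
            step = [dt * (n + f + d) for n, f, d in zip(N[i], F[i], D)]
            new_X.append([min(max(x + s, lo), hi) for x, s, (lo, hi) in zip(X[i], step, bounds)])
        X = new_X
        K = [objective(x) for x in X]
        if min(K) < best_f:
            best_f = min(K)
            best_x = list(X[K.index(best_f)])
        history.append(best_f)
    return best_x, best_f, history


def sphere(x: Vector) -> float:
    """Placeholder objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_f, history = krill_herd(sphere, bounds=[(-5.0, 5.0)] * 2, seed=1)
```

To tune the FKNN, the placeholder objective would be replaced by a function that maps a candidate parameter vector to a validation error; that mapping is not specified in the text and is left open here.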

Figure 2: Flowchart of KH

4 Results and Discussion

The performance of the DTL-PSCC technique is validated using the PROSTATEx-2 Challenge dataset [22], which holds a set of 162 MRI images with 5 class labels, namely transaxial T2W, sagittal T2W, ADC, DW, and Ktrans. In this study, the five classes are referred to as targets (target 1 to target 5). The results are examined under varying ratios of training and testing data. A few sample images are shown in Fig. 3.
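The two training/testing ratios used in the experiments can be produced with a simple split routine. The sketch below assumes a stratified split, which the paper does not state, and uses placeholder labels for the 162 images because the actual class distribution is not given.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(labels: Sequence[int], train_fraction: float, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Split sample indices so every class keeps roughly the same train/test ratio."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    train, test = [], []
    for lab in sorted(by_class):
        idxs = by_class[lab]
        rng.shuffle(idxs)
        cut = round(train_fraction * len(idxs))
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return sorted(train), sorted(test)


# Placeholder labels for 162 images over five targets (hypothetical distribution).
labels = [i % 5 for i in range(162)]
train_80, test_80 = stratified_split(labels, train_fraction=0.8)
train_70, test_70 = stratified_split(labels, train_fraction=0.7)
```

With these placeholder labels, the 80:20 split yields 130 training and 32 testing images; the real counts depend on the actual label distribution.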

Figure 3: Sample images

Tab. 1 and Fig. 4 offer a brief result analysis of prostate cancer classification by the DTL-PSCC technique under the training/testing split of 80:20. The results show that the DTL-PSCC technique has obtained effective performance. For instance, the DTL-PSCC technique has classified the images into target 1 with precision, recall, accuracy, F-score, and kappa of 84.55%, 86%, 85.09%, 85.58%, and 84.45% respectively.

Table 1: Result analysis of DTL-PSCC technique with different measures on training/testing (80:20)

Moreover, the DTL-PSCC technique has categorized the images into target 3 with precision, recall, accuracy, F-score, and kappa of 85.27%, 86.14%, 85.03%, 84.55%, and 84.41% respectively. Furthermore, the DTL-PSCC technique has identified the images of target 5 with precision, recall, accuracy, F-score, and kappa of 85.06%, 85.38%, 84.75%, 85.03%, and 85.50% respectively.

Fig. 5 demonstrates the average prostate cancer detection results of the DTL-PSCC technique under the training/testing split of 80:20. The figure shows that the DTL-PSCC technique has accomplished improved classification performance, with average precision, recall, accuracy, F-score, and kappa of 85%, 85.82%, 85.09%, 85.02%, and 84.94% respectively.
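For reference, the five measures can be computed per class from a confusion matrix in a one-vs-rest fashion and then averaged over the classes. The paper does not state how its per-class accuracy and kappa were obtained, so the formulas below (standard precision, recall, F-score, accuracy, and Cohen's kappa on the 2×2 one-vs-rest table) and the small example matrix are assumptions for illustration only.

```python
from typing import Dict, List, Sequence


def one_vs_rest_metrics(cm: Sequence[Sequence[int]], cls: int) -> Dict[str, float]:
    """Metrics for one class; cm[t][p] counts samples of true class t predicted as p."""
    total = sum(sum(row) for row in cm)
    tp = cm[cls][cls]
    fn = sum(cm[cls]) - tp
    fp = sum(row[cls] for row in cm) - tp
    tn = total - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / total
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - p_e) / (1.0 - p_e) if p_e != 1.0 else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy,
            "f_score": f_score, "kappa": kappa}


def macro_average(cm: Sequence[Sequence[int]]) -> Dict[str, float]:
    """Unweighted mean of every metric over all classes."""
    per_class: List[Dict[str, float]] = [one_vs_rest_metrics(cm, c) for c in range(len(cm))]
    return {name: sum(m[name] for m in per_class) / len(per_class) for name in per_class[0]}


# Small two-class example (made-up counts, not results from the paper).
cm = [[8, 2],
      [1, 9]]
target_0 = one_vs_rest_metrics(cm, 0)
averages = macro_average(cm)
```

The same functions apply unchanged to a 5×5 confusion matrix for the five targets.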

Figure 4: Result analysis of DTL-PSCC technique on training/testing (80:20)

Figure 5: Average analysis of DTL-PSCC technique on training/testing (80:20)

Fig. 6 illustrates the accuracy analysis of the DTL-PSCC technique on the training/testing (80:20) dataset. The outcomes show that the DTL-PSCC technique achieves high training and validation accuracy. It can also be observed that the validation accuracy is higher than the training accuracy.

Fig. 7 showcases the loss analysis of the DTL-PSCC technique on the training/testing (80:20) dataset. The results show that the DTL-PSCC technique attains low training and validation loss. It can also be observed that the validation loss is lower than the training loss.

Figure 6: Accuracy analysis of DTL-PSCC technique on training/testing (80:20)

Figure 7: Loss analysis of DTL-PSCC technique on training/testing (80:20)

Tab. 2 and Fig. 8 provide a detailed result analysis of prostate cancer classification by the DTL-PSCC technique under the training/testing split of 70:30. The outcomes show that the DTL-PSCC technique has obtained effective performance. For instance, the DTL-PSCC technique has classified the images into target 1 with precision, recall, accuracy, F-score, and kappa of 84.95%, 85.14%, 84.77%, 84.81%, and 84.76% respectively. Moreover, the DTL-PSCC technique has categorized the images into target 3 with precision, recall, accuracy, F-score, and kappa of 84.87%, 85.26%, 85.24%, 84.32%, and 85.58% respectively. Also, the DTL-PSCC technique has identified the images of target 5 with precision, recall, accuracy, F-score, and kappa of 84.63%, 85.34%, 85.22%, 85.03%, and 84.51% respectively.

Table 2: Result analysis of DTL-PSCC technique with different measures on training/testing (70:30)

Figure 8: Result analysis of DTL-PSCC technique on training/testing (70:30)

Fig. 9 depicts the average prostate cancer detection outcomes of the DTL-PSCC technique under the training/testing split of 70:30. The figure shows that the DTL-PSCC technique has accomplished good classification performance, with average precision, recall, accuracy, F-score, and kappa of 84.64%, 85.51%, 84.87%, 85%, and 84.88% respectively.

Fig. 10 portrays the accuracy analysis of the DTL-PSCC technique on the training/testing (70:30) dataset. The results show that the DTL-PSCC technique achieves high training and validation accuracy. It is noticed that the validation accuracy is higher than the training accuracy.

Figure 9: Average analysis of DTL-PSCC technique on training/testing (70:30)

Figure 10: Accuracy analysis of DTL-PSCC technique on training/testing (70:30)

Fig. 11 depicts the loss analysis of the DTL-PSCC technique on the training/testing (70:30) dataset. The outcomes show that the DTL-PSCC technique attains low training and validation loss. It can also be observed that the validation loss is lower than the training loss.

Figure 11: Loss analysis of DTL-PSCC technique on training/testing (70:30)

Lastly, a detailed comparative analysis of the DTL-PSCC technique with recent methods is offered in Tab. 3 and Fig. 12. The results demonstrate that the LL-support vector machine (SVM), LL-logistic regression with L1 penalty (LLR), and HL-SMC methods obtained the lowest prostate classification performance, with accuracy of 33.92%, 28.57%, and 47.30% respectively. Meanwhile, the decision tree (DT) and random forest (RF) models obtained moderate outcomes, with accuracy of 72.81% and 75.24% respectively. However, the DTL-PSCC technique showed superior performance, with precision, recall, accuracy, F-score, and kappa of 85%, 85.82%, 85.09%, 85.02%, and 84.94% respectively. Therefore, the results confirm that the DTL-PSCC technique achieves the highest prostate classification performance among the compared methods.

Table 3: Comparative analysis of DTL-PSCC technique with existing approaches

Figure 12: Comparative analysis of DTL-PSCC technique with existing approaches

5 Conclusion

In this study, an effective DTL-PSCC technique has been developed to classify prostate cancer using MRI images. The proposed DTL-PSCC technique involves several subprocesses, namely preprocessing, EfficientNet based feature extraction, FKNN based classification, and KHA based parameter tuning. The membership value of the FKNN model is optimally tuned by the KHA, which results in improved classification performance. To demonstrate the enhanced classification outcome of the DTL-PSCC technique, a wide range of simulations was carried out on benchmark MRI datasets. The comparative results show that the DTL-PSCC technique outperforms recent methods, with a maximum accuracy of 85.09%. Hence, the DTL-PSCC technique is a proficient approach for prostate cancer classification and detection. In future, deep learning based segmentation techniques can be incorporated to improve the efficiency of the DTL-PSCC technique.

Acknowledgement: The authors would like to acknowledge the support of Prince Sultan University for paying the Article Processing Charges (APC) of this publication.

Funding Statement: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (RGP 2/25/43). This work was also supported by Taif University Researchers Supporting Project Number (TURSP-2020/346), Taif University, Taif, Saudi Arabia.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.