An Efficient Method for Covid-19 Detection Using Light Weight Convolutional Neural Network

2021-12-15 Saddam Bekhet, Monagi Alkinani, Reinel Tabares-Soto and Hassaballah

Computers, Materials & Continua, 2021, Issue 11

Saddam Bekhet, Monagi H. Alkinani, Reinel Tabares-Soto and M. Hassaballah

1 Faculty of Commerce, South Valley University, Qena, Egypt

2 Department of Computer Science and Artificial Intelligence, College of Computer Science and Engineering, University of Jeddah, Jeddah, 21959, Saudi Arabia

3 Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, 170001, Colombia

4 Department of Computer Science, Faculty of Computers and Information, South Valley University, Qena, Egypt

Abstract: The COVID-19 pandemic is a significant milestone in the modern history of civilization, with a catastrophic effect on global wellbeing and the economy. The situation is very complex because COVID-19 test kits are limited; therefore, more diagnostic methods must be developed urgently. A significant initial step towards the successful diagnosis of COVID-19 is the chest X-ray or Computed Tomography (CT) scan, where chest anomalies (e.g., lung inflammation) can be easily identified. Most hospitals possess X-ray or CT imaging equipment that can be used for early detection of COVID-19. Motivated by this, various artificial intelligence (AI) techniques have been developed to identify COVID-19 positive patients using chest X-ray or CT images. However, the advance of these AI-based systems and their highly tailored results are strongly tied to high-end GPUs, which are not widely available in several countries. This paper introduces a technique for early COVID-19 diagnosis based on medical experience and light-weight Convolutional Neural Networks (CNNs), which does not require custom hardware to run, in contrast to currently available CNN models. The proposed deep learning model is built carefully and fine-tuned by removing all unnecessary parameters and layers to achieve a light-weight design that runs smoothly on a normal CPU (0.54% of AlexNet parameters). This model is highly beneficial for countries where high-end GPUs are luxuries. Experimental outcomes on some new benchmark datasets show the robustness of the proposed technique in recognizing COVID-19 with 96% accuracy.

Keywords: Artificial intelligence; COVID-19; chest CT; chest X-ray; deep learning

1 Introduction

In January 2020, the World Health Organization announced a Public Health Emergency of International Concern (PHEIC) due to the world-wide spread of Coronavirus disease 2019 (COVID-19). Human coronaviruses (CoV) belong to the order Nidovirales, family Coronaviridae, subfamily Coronavirinae [1]. The viruses in the subfamily Coronavirinae can be classified into four types: α, β, γ and δ. CoVs (α, β, γ and δ) primarily infect a wide variety of animal species, including mammals and birds, mostly in the respiratory and gastrointestinal tracts. Although individual virus species mostly appear to be limited to a narrow host range comprising a single animal species, genome sequencing indicates that CoVs have crossed the host species barrier frequently [2]. The winter of 2002 witnessed the emergence of severe acute respiratory syndrome (SARS), which was quickly attributed to a new CoV, the SARS-CoV [3]. Afterwards, near the end of 2019, a novel β-coronavirus emerged, namely SARS-CoV-2, the virus that causes COVID-19. This coronavirus is highly infectious and in severe cases may cause acute respiratory distress or organ failure [1].

The number of positive cases is growing exponentially everywhere in the world day after day, and the virus has infected more than 100 million people to date. The health systems of several countries have come close to collapse because of this fast growth in infected cases [4]. Most countries now face a shortage of ventilators and testing kits. Thus, they have declared lockdowns and requested people to avoid gatherings and stay indoors. Due to the lack of available diagnostic instruments, the medical situation is complicated, and many countries are only able to apply restricted COVID-19 tests [3,4]. Despite significant efforts to find an efficient way to detect COVID-19, the availability of suitable medical resources in many countries is a major challenge. Therefore, there is an urgent need for quick and low-cost tools for early COVID-19 detection and diagnosis.

Figure 1: Sample normal chest X-ray scans vs. ones diagnosed with COVID-19. Images are from the covid-chestxray dataset [5]

Attempts have been made to find an effective and easy way to identify infected patients early. Typically, the disease is confirmed by reverse transcription polymerase chain reaction (RT-PCR). However, for early detection and evaluation of reported patients, the RT-PCR sensitivity may not be high enough [6]. In contrast, as non-invasive imaging procedures, X-rays and Computed Tomography (CT) can reveal certain characteristic manifestations in the lungs. Thus, X-rays and CT scans can be used for early evaluation of COVID-19 and other types of pneumonia. To highlight the difference, some normal and COVID-19 positive chest X-ray images are displayed along with their clinical diagnosis in Fig. 1. Bilateral lung infiltrates (areas marked in red) are seen in the chest X-rays of COVID-19 cases, which display a homogeneous opacity of the infected lungs (i.e., mostly pneumonic opacity).

Furthermore, the use of AI-based techniques has grown exponentially in recent years in many areas of medical practice and healthcare [7]. AI-based techniques do not suffer from fatigue; thus, they can process large quantities of data at very high speed and can outperform human accuracy on the same task. AI is applied in almost every field of medicine, such as drug design and discovery and patient monitoring [8]. For instance, AI is used in medical technologies to improve the diagnostic capability of clinicians, especially in multi-disease diagnosis [9-11] and medical image analysis [11]. With progress in employing more intelligent AI techniques in healthcare, patients can be diagnosed professionally and faster, and thus may start treatment sooner.

Recently, huge research efforts have been made to diagnose COVID-19 using AI [10,11], in particular using deep convolutional neural networks, which have revolutionized numerous fields of science [12] by introducing non-traditional and efficient solutions to many image-related problems that had long remained unsolved or only partially addressed. For example, deep learning has achieved remarkable performance on several visual tasks such as object segmentation in medical applications [13] and cancer MRI image classification [14]. In the context of COVID-19 detection, deep learning was utilized in [15] to extract regions from chest X-ray images that may identify features of COVID-19. Transfer learning was also used for pneumonia classification and visualization [11]. In [16], a model for automatic detection of COVID-19 infection from raw chest X-ray images based on deep neural networks was proposed.

From a computational perspective, the rise of DL-based models has been fueled by improvements in hardware accelerators. The GPU remains the most widely used accelerator for DL applications [17]. Furthermore, as DL models become increasingly pervasive and accurate, their computing and memory requirements are growing enormously and are likely to outpace improvements in GPU resources and performance. For instance, training CNNs takes a huge amount of time (e.g., 100-epoch training of ResNet-50 on the ImageNet dataset using one M40 GPU requires 14 days) [17]. However, GPUs are not affordable for every researcher, or even for some institutions with limited funding, due to their high price. Even for cloud-based GPUs, an expensive monthly rental charge needs to be paid. Furthermore, cloud-based GPUs require a high-speed broadband connection for data uploading, which is still a problem for some countries and hospitals where broadband availability is limited.

In summary, the aim of this paper is to develop an efficient and effective COVID-19 detection technique from chest X-ray and CT images. The effectiveness aspect is achieved using deep learning-based methods, which are the state of the art in pattern recognition. The efficiency aspect is achieved by implementing a light-weight deep learning model that does not require a GPU or custom hardware to run. Thus, the contribution of this paper is a hybrid light-weight CNN model with fewer than 400 K parameters that effectively diagnoses COVID-19 from either chest X-ray or CT images. The proposed network model is stripped of all unnecessary layers and related parameters to enable its operation on normal CPUs. This is in contrast to the majority of current CNN models, which require high-end GPUs to run. The model contains a quite compact number of trainable parameters (335,442), which represents 0.54% of the famous AlexNet model (61 M) [18], 0.24% of the VGG16-based model (138 M) [19], 29% of DarkNet (1,127,334) [16], and 2.85% of COVID-Net (11.75 M) [20]. Finally, the results presented in this paper provide a fast and solid starting point in the fight against COVID-19, particularly when doctors are required to test a huge number of patients in a limited time period.

The remainder of the paper is organized as follows. Section 2 presents the related literature. The proposed deep learning CNN model is presented in Section 3. Section 4 is dedicated to the experiments and the discussion of the associated results. Finally, the paper is concluded in Section 5.

2 Literature Review

In the past year, the literature became crowded with COVID-19 related research, as depicted in Tab. 1. However, the majority of this research is not yet peer-reviewed and exists only on open-access archives. A careful analysis of this research reveals some serious drawbacks, as follows:

· There are no fixed COVID-19 symptoms, as they differ across countries and may overlap with other forms of pneumonia (e.g., SARS). This limits the ability to develop robust standard diagnostic techniques [3,4].

· The test data, either chest X-rays or CT scans, are very limited in quantity and quality and are hidden from researchers, which delays the building of robust AI techniques for the greater good of humanity.

· A group of approaches was developed using the transfer learning scheme [19,21] from the 1.2-million-image ImageNet dataset [18]. This might be useful in non-specific image classification problems, where the learned features could generalize to other classification problems.

· The majority of computer vision labs that released early COVID-19 research made full use of their available hardware, e.g., top-end GPUs. This is suitable for countries that can afford such hardware to assist in early diagnosis of COVID-19. However, other countries cannot afford such special hardware in their fight against COVID-19.

· The performance of deep learning techniques is mostly affected by the quantity of positive cases. However, most works in the literature used, on average, about a hundred COVID-19 positive cases. This is quite a small number of positive cases on which to consolidate the performance.

· The majority of available methods utilize either chest X-ray or CT scans but not both [22-24]. To the best of our knowledge, there is no method that can handle both types of images with the same settings and setup, due to the differences in structure and acquisition between CT and X-ray images.

In general, for the above reasons, much of the literature is still immature in providing practical AI solutions that can assist in early COVID-19 detection from chest screening images. Additionally, CNN-based approaches are efficient and fit the job, as they integrate feature extraction and classification in a robust end-to-end model that takes the raw input data and generates the final classification result. Clearly, there is still a lot of room for work to help fight this pandemic by introducing cheap and fast solutions that can help humankind in its battle against infections, particularly if the solutions can deal with both chest X-ray and CT scans.

Table 1: Summary of primary prior literature using deep learning approaches on COVID-19 identification

3 The Proposed Method

The field of computer vision has witnessed an unprecedented use of CNNs [26], especially for video/image analysis [26]. CNNs require less pre-processing effort than other feature extraction and classification algorithms. In addition, the network is entirely responsible for generating the required filters and feature maps, with minimal pre-processing and human intervention. At the heart of deep learning are convolution operations on the input images. The convolution operation is described as follows in Eq. (1):
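A standard form of this operation, written with the symbols Z and R used in the text and offered here as an assumed reconstruction rather than the authors' exact typesetting (the index names i, j, m, n are introduced for illustration), is:

```latex
(Z \ast R)(i, j) \;=\; \sum_{m}\sum_{n} Z(i - m,\; j - n)\, R(m, n)
```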

Here the input image is denoted by Z, and R is the required 2D convolution matrix (kernel) that slides over the entire image Z. The operator ∗ represents the discrete convolution operation. The generic structure of the proposed deep learning CNN for COVID-19 detection is depicted in Fig. 2.
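To make the sliding-kernel idea concrete, the following is a minimal pure-Python sketch of a discrete 2D convolution in "valid" mode (no padding). The function name `convolve2d` and the list-of-lists image representation are assumptions made for illustration; deep learning frameworks use optimized routines and typically apply the kernel without flipping it.

```python
def convolve2d(image, kernel):
    """Valid-mode discrete 2D convolution of a list-of-lists image.

    The kernel is flipped in both axes, as in the mathematical definition
    of convolution (hypothetical helper, not the authors' implementation).
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for i in range(ih - kh + 1):          # slide vertically
        out_row = []
        for j in range(iw - kw + 1):      # slide horizontally
            acc = 0
            for m in range(kh):
                for n in range(kw):
                    acc += image[i + m][j + n] * flipped[m][n]
            out_row.append(acc)
        out.append(out_row)
    return out


Z = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
R = [[1, 0],
     [0, -1]]
feature_map = convolve2d(Z, R)  # 2x2 output for a 3x3 image and 2x2 kernel
```

A 5×5 kernel applied to a 200×200 image in this mode would give a 196×196 feature map, which is why padding and stride choices affect the layer shapes.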

The core aim of this paper is to develop a light-weight deep CNN model. Although this idea contradicts the common practice of stacking as many layers as possible in CNN models [27], it suits the small amount of available COVID-19 data and allows the model to run in places with limited computing power. However, light-weight CNN models in the literature have generally achieved limited performance [28]. The problem can be stated as building a deep learning model using the fewest layers/parameters with the maximum accuracy. Given a deep learning CNN model composed of N layers L_1, L_2, ..., L_N arranged in a specific order, the deep learning network model NW can be defined by:

Each layer consists of a sequence of trainable parameters. Hence, the total number of trainable parameters in the whole network is given by:

where φ(L_i) is a function that retrieves the number of trainable parameters at a given layer L_i. Furthermore, the performance of a complete run over all epochs is:

where NW_i is the i-th network model trial, while P_i is the performance measure (i.e., accuracy). The targeted light-weight network model can be cast as a minimization problem defined as follows:
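The displayed equations referred to in this passage, Eqs. (2) to (5), can be sketched as follows. This is an assumed reconstruction from the surrounding definitions only; the symbols T, run and NW* are introduced here for illustration and may differ from the authors' notation.

```latex
\begin{align}
NW &= \left( L_1, L_2, \ldots, L_N \right) \tag{2} \\
T(NW) &= \sum_{i=1}^{N} \varphi(L_i) \tag{3} \\
P_i &= \operatorname{run}(NW_i) \tag{4} \\
NW^{*} &= \arg\min_{NW_i} \; T(NW_i)
  \quad \text{subject to} \quad P_i = \max_{j} P_j \tag{5}
\end{align}
```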

Practically, the problem can be approached by controlling the number of network layers and testing the corresponding accuracy. However, with each added layer there are tens of parameters that have to be optimized, such as kernel size, activation function, batch size, etc. Thus, each of the key parameters is investigated in detail to select the value that achieves the maximum accuracy with the lowest running cost. For brevity, only the final COVID-19 detection CNN layers are detailed in Tab. 2. The full experimental analysis of tuning the key network parameters is explained in the next section.
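The selection rule described above (maximum accuracy first, then the fewest parameters) can be sketched in a few lines. The function name `select_lightweight` and every trial record except the 335,442-parameter, 96% accuracy entry are hypothetical values chosen only to exercise the logic; they are not results from the paper.

```python
def select_lightweight(trials):
    """Return the trial with the highest accuracy, breaking ties by
    the smallest number of trainable parameters."""
    best_accuracy = max(t["accuracy"] for t in trials)
    candidates = [t for t in trials if t["accuracy"] == best_accuracy]
    return min(candidates, key=lambda t: t["params"])


# Hypothetical sweep: only the (335442, 0.96) entry comes from the text.
trials = [
    {"params": 120_000, "accuracy": 0.91},
    {"params": 335_442, "accuracy": 0.96},
    {"params": 900_000, "accuracy": 0.96},
    {"params": 2_500_000, "accuracy": 0.94},
]
chosen = select_lightweight(trials)
```

In a real sweep each record would come from a full training run, so the cost lies in producing the trials rather than in the selection itself.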

Figure 2: General structure of the proposed COVID-19 CNN diagnosis system

Table 2: The layers and layer parameters of the proposed CNN model. The last dense layer shape is based on the 10 pneumonia cases depicted in the covid-chestxray dataset [6]

4 Experimental Setup

The efficiency of the proposed CNN model for COVID-19 detection is explored in this section. A full description of the covid-chestxray dataset, the parameter setting and tuning during the training stage, and their validation at the testing stage is also presented. At the end, a thorough analysis of the obtained results is given.

4.1 Parameters Setting and Tuning

The first network parameter to be investigated is the total number of trainable parameters in the whole network, as described in the previous section. This parameter is implicitly controlled by the number of network layers. Fig. 3 displays the effect of increasing the number of parameters on the model accuracy after 21 full model runs (full epochs for each run). The figure shows a performance peak of 96% accuracy at 335,442 trainable parameters. This represents the optimal number of parameters required to run the network effectively.

Figure 3: Impact of increasing training parameters on the performance of the proposed network. The peak is detected using 335,442 trainable parameters

The proposed CNN model was also tested with varying convolutional filter sizes, as depicted in Fig. 4, which shows that the model performs best with a 5×5 filter, while the curve flattens at lower accuracy for larger filter sizes (≥6×6). This filter size (5×5) is marginally greater than the standard CNN filter size [27], but it makes it easier to spot lung anomalies that arise in larger areas.

Figure 4: Impact of increasing convolutional filter size on the accuracy of the network. The peak is detected using a 5×5 filter

A number of activation functions are examined as well (i.e., Sigmoid, Softmax, ReLU, Tanh, Exponential, Hard Sigmoid and Softplus). Fig. 5 depicts the performance of the proposed COVID-19 model under the various activation functions. The outcomes suggest that the sigmoid function, Eq. (6), is the best one, since it maps f(x) from a wide range to [0,1]. This fits the multi-class classification problem of covid-chestxray, as it includes ten different diagnoses for the represented cases.
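For reference, the sigmoid function cited as Eq. (6) has the standard form below; it is restated from the general definition rather than copied from the authors' typesetting:

```latex
f(x) = \frac{1}{1 + e^{-x}}
```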

The batch size is another important factor to be examined, as it controls the number of training examples utilized in a single iteration. Fig. 6 shows the effect of varying the batch size on the performance of the proposed model. A batch size of ten examples per iteration is the best; larger batch sizes did not improve the performance but added extra memory load on the system.

4.2 Chest X-ray Dataset

A public dataset of pneumonia chest X-ray cases [6] is used in this paper. The dataset portrays nine types of pneumonia (e.g., COVID-19, MERS, SARS, ARDS) and some normal X-ray cases, as demonstrated in Fig. 7. The group of nine distinct pneumonia types portrayed in the dataset helps in decreasing the underfitting of the proposed network model, as the model needs to learn numerous variations among the nine pneumonia types. The dataset is continually updated with images from different open-access sources. At the time of writing, the dataset had reached 951 chest X-ray images from 481 subjects. There are 558 males and 311 females, while the rest are missing the gender information. The minimum and maximum ages are 18 and 94 years, with an average age of 40 years. 583 of the cases are COVID-19 positive, while the remaining cases are either normal or depict other pneumonia types.

Figure 5: The effect of activation function on the proposed COVID-19 network accuracy performance. The best function is Sigmoid

Figure 6: Impact of the batch size t on the proposed COVID-19 network accuracy. The optimal batch size is ten

4.3 Network Training Phase

As the majority of available COVID-19 datasets are limited in size and inevitably induce an overfitting effect, data augmentation techniques are essential to tackle this problem and increase the dataset size artificially with label-preserving transformations [29]. Practically, all dataset images were first resized to 200×200 pixels and then passed through a randomized reflection and/or translation in a ±30 range. This is necessary and common practice to prevent positional bias in the results [29]. Furthermore, all augmented images are generated at runtime to reduce the computational load. The training and validation performance of the proposed CNN network model is shown in Fig. 8.
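The augmentation step can be illustrated with a small pure-Python sketch. It assumes the ±30 range is measured in pixels, that the reflection is horizontal, and that vacated pixels are zero-filled; none of these details is stated in the text, and the function names are hypothetical.

```python
import random


def reflect_horizontal(image):
    """Mirror a list-of-lists image left to right."""
    return [row[::-1] for row in image]


def translate(image, dx, dy, fill=0):
    """Shift an image by (dx, dy) pixels, filling vacated pixels with `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = image[y][x]
    return out


def augment(image, max_shift=30, rng=random):
    """Apply a random reflection (assumed probability 0.5) and a random
    translation of up to max_shift pixels in each direction."""
    if rng.random() < 0.5:
        image = reflect_horizontal(image)
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return translate(image, dx, dy)


# Generating a fresh variant per epoch at runtime avoids storing augmented copies.
sample = [[(r * 200 + c) % 256 for c in range(200)] for r in range(200)]
variant = augment(sample, rng=random.Random(0))
```

Both transformations preserve the diagnosis label, which is what makes them suitable for enlarging a small medical dataset.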

Figure 7: Analysis of the X-ray image cases depicted in the covid-chestxray dataset [6]

Figure 8: The training and validation performance of the proposed CNN network model in the first 1000 epochs. The graph shows that the performance stabilizes after the first 500 epochs at 96% accuracy

The training process for the proposed deep model is performed using stochastic gradient descent (SGD) [30]. SGD updates the parameters using mini-batches of B = 10 examples. The momentum was set to 9×10⁻¹ and the weight decay was set to 1×10⁻³, as the network is considered a shallow network [31]. This small weight decay value is important, as it helps to minimize the model training error [31]. Training and evaluation are performed on an Intel Core i5 machine at 2.9 GHz equipped with 8 GB of RAM. This hardware represents a computer of average processing power with no installed GPU. The network is implemented and trained using the TensorFlow framework [32].
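A single parameter update with these settings can be sketched as below. The sketch uses the classical momentum form with the weight decay added to the gradient as an L2 term; the learning rate is not given in the text, so the default `lr=0.01` is purely an assumption, and the function name is hypothetical. The TensorFlow optimizer the authors configured may differ in detail.

```python
def sgd_momentum_step(weights, grads, velocity,
                      lr=0.01, momentum=0.9, weight_decay=1e-3):
    """One SGD step with momentum and L2 weight decay.

    weights, grads and velocity are equal-length lists of floats.
    Returns the updated (weights, velocity) as new lists.
    """
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g_total = g + weight_decay * w          # add the L2 penalty gradient
        v = momentum * v - lr * g_total         # accumulate velocity
        new_velocity.append(v)
        new_weights.append(w + v)               # move along the velocity
    return new_weights, new_velocity


# Two steps on a single weight with a constant gradient (illustrative values).
w, v = [1.0], [0.0]
w, v = sgd_momentum_step(w, [0.5], v, lr=0.1)
first_w, first_v = w[0], v[0]
w, v = sgd_momentum_step(w, [0.5], v, lr=0.1)
```

With a mini-batch of B = 10, the gradient passed to each step would be the average over those ten examples.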

4.4 Results and Discussion

This section introduces the testing protocol for the proposed CNN model along with an analysis of the results. Since the chest X-ray image datasets do not have a predefined test split (as is the case for the majority of COVID-19 datasets), the common random 70%-30% training-validation split is adopted during the experiments. Regarding the quantitative evaluation, a group of standard measures [33] is used, i.e., Accuracy, Sensitivity, Specificity, Mean Absolute Error (MAE) and Area Under the ROC (Receiver Operating Characteristic) Curve (AUC). These metrics are defined in the following equations:
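The standard forms of these measures, written with the symbols explained in the next paragraph, are given here as an assumed reconstruction of the original equations. The AUC is omitted because the form used by the authors cannot be recovered from the text.

```latex
\begin{align}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Sensitivity} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP} \\
\text{MAE} &= \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|
\end{align}
```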

where TP is True Positive, TN is True Negative, FP is False Positive and FN is False Negative. y_i is the correct class label, ŷ_i is the predicted class label and N is the total number of classified cases. Additionally, the validation-loss metric is used to provide an additional indicator of the model efficiency, since it demonstrates how well the model performance generalizes to as yet unseen data. Here ŷ_i and y_i are as defined in Eq. (10), and λ is the individual loss function (i.e., log-loss in this case).
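As a worked illustration, the measures can be computed with a few small functions. The function names, the multi-class form of the log-loss and the clipping constant `eps` are assumptions made for this sketch; the confusion-matrix counts in the example are invented to show the arithmetic and are not results from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp, fn):
    """True positive rate."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate."""
    return tn / (tn + fp)


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def log_loss(y_true, probabilities, eps=1e-15):
    """Mean negative log-probability assigned to the correct class.

    y_true holds class indices; probabilities holds one list of class
    probabilities per example.
    """
    total = 0.0
    for label, probs in zip(y_true, probabilities):
        p = min(max(probs[label], eps), 1.0 - eps)  # avoid log(0)
        total -= math.log(p)
    return total / len(y_true)


# Invented counts: 48 TP, 48 TN, 2 FP, 2 FN.
acc = accuracy(48, 48, 2, 2)
sens = sensitivity(48, 2)
spec = specificity(48, 2)
```

Reporting sensitivity and specificity alongside accuracy matters on unbalanced datasets, where a high accuracy can hide a poor detection rate for the minority class.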

The proposed CNN model achieves 96% accuracy on the chest X-ray dataset, while the log-loss is 0.2. This is a rather positive finding given the light-weight design of the proposed CNN model and the small dataset size. Tab. 3 shows the values of the five performance metrics on the chest X-ray dataset, which reflect the robust performance of the proposed network.

Table 3: Performance metrics of the proposed network model on the chest X-ray dataset

Furthermore, the accuracy of the proposed CNN model is compared against seven additional (GPU-based) baselines that reflect the most recent work on COVID-19 detection using deep learning models. The comparison shown in Fig. 9 confirms the effective performance of the proposed CNN model for COVID-19 detection, as its accuracy is higher than that of the other baselines by 6.4 ± 3.7%. The results are further consolidated by an expert radiology team, where the same accuracy measure is adopted to quantify this experiment. Every X-ray image that is correctly classified by the proposed CNN model is shown to the radiologists, who manually reclassify it. The obtained radiologist-based accuracy is 99%, which further confirms the robustness of the model.

Figure 9: Performance of the proposed CNN COVID-19 detection model vs. DRE-Net [22], M-Inception [23], UNet+3D deep network [24], COVIDX-Net [25], COVID-Net [20], transfer learning (Inception) [21] and transfer learning (VGG-16) [19]

Moreover, in order to emphasize the effectiveness of the proposed model, it is tested on two other publicly available COVID-19 datasets. The first, DS1 [34], is a group of 98 X-ray cases, 70 of which are COVID-19 positive, while the remaining cases are normal. The second, DS2 [16], contains 1125 X-ray cases: 125 cases are positive COVID-19 patients, 500 cases are diagnosed with pneumonia and the last 500 are normal cases. In addition, one of the largest chest CT image datasets [35], i.e., DS3, is used to verify the hybrid aspect of the model. This dataset consists of 2482 CT images: 1251 cases are COVID-19 positive, while the remaining 1231 are normal cases. Fig. 10 depicts some illustrative sample cases from the DS1, DS2 and DS3 datasets, respectively.

The accuracy (%) of the proposed model on the DS1, DS2 and DS3 datasets is shown in Fig. 11 and Tab. 4. The figure reflects the stable performance of the model across DS1 and DS2. The accuracy on DS2 is below 90% because the dataset is unbalanced: it contains only about 11% COVID-19 positive cases (125 of 1125), which causes an underfitting problem. Furthermore, the obtained result shows a robust performance (92.3%) on DS3, which is composed of chest CT images. However, the model is trained from scratch on DS3 due to the difference in X-ray and CT characteristics, as the X-ray based network knowledge cannot be transferred directly to the CT dataset.

Figure 10: Illustrative samples of positive and negative COVID-19 cases from DS1 [34], DS2 [16] and DS3 [35]. DS1 and DS2 are chest X-ray images, while DS3 is chest CT images

Figure 11: Performance of the proposed model on two additional publicly available COVID-19 datasets: DS1 [34] and DS2 [16]. DS0 is the covid-chestxray dataset, added for illustration

Table 4: Performance metrics of the proposed network model on different datasets

Regarding the hybrid nature of the proposed CNN model, its performance on CT images (DS3) was benchmarked against three recent baselines that also utilize CT images. The results depicted in Fig. 12 confirm the effectiveness of the proposed model on CT images, where it outperforms the baselines by 5.5 ± 4% in accuracy. This comparison confirms the hybrid nature of the model, which can be used for X-ray and CT images with the same structure and parametrization but with full retraining on the CT data.

Figure 12: Performance of the proposed COVID-19 CNN model using CT data against DRE-Net [22], M-Inception [23] and UNet+3D deep network [24]

Figure 13: Number of parameters of the proposed COVID-19 CNN model compared to COVID-Net [20], DarkNet [16], VGG16 [19] and AlexNet [18] CNN models

Finally, to emphasize the light-weight aspect of the proposed COVID-19 CNN model, Fig. 13 depicts a comparison based on the number of parameters for a group of recent and benchmark baselines. The proposed model parameters represent 8.1% on average of the other models' parameters.
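The parameter ratios behind this comparison can be checked with simple arithmetic. The sketch below uses the parameter counts quoted in the Introduction, where the baseline counts are rounded (e.g., 61 M for AlexNet), so the computed percentages are approximate; with these rounded counts the mean comes out slightly above the 8.1% quoted, which is consistent with the per-model percentages having been truncated before averaging.

```python
PROPOSED_PARAMS = 335_442

# Baseline parameter counts as quoted in the paper (rounded where given in millions).
BASELINE_PARAMS = {
    "AlexNet": 61_000_000,
    "VGG16": 138_000_000,
    "DarkNet": 1_127_334,
    "COVID-Net": 11_750_000,
}

# Proposed model size as a percentage of each baseline.
ratios = {
    name: 100.0 * PROPOSED_PARAMS / count
    for name, count in BASELINE_PARAMS.items()
}
mean_ratio = sum(ratios.values()) / len(ratios)

for name, pct in ratios.items():
    print(f"{name}: {pct:.2f}% of parameters")
print(f"mean: {mean_ratio:.2f}%")
```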

5 Conclusion

A fully automated hybrid CNN model for COVID-19 detection from either chest X-ray or CT images has been proposed in this research paper. The introduced model achieved 96% and 92.08% accuracy on X-ray and CT images, respectively. In contrast with current research, the proposed CNN model is light-weight and contains only 335,442 trainable parameters. This is a quite compact number of parameters that does not require any custom hardware to run, making the model suitable for places with limited medical funding. In addition, the model output was clinically validated by specialized radiologists. Thus, the findings presented in this paper are encouraging, as the proposed CNN model can be packaged and used for fast diagnosis in areas that are short of radiologists. Regarding future work, the model will be retrained and reused for detecting and diagnosing other types of pneumonia.

Acknowledgement: Thanks to Dr. Maher Salama and the radiology team at South Valley University hospitals for providing the clinical feedback in the paper, including comments on figure captions and validation of the model outputs. Also, Dr. Monagi H. Alkinani extends his appreciation to the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia for supporting his research work through the Project Number MoE-IF-20-01.

Funding Statement: The authors received no specific funding for this study.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
