APP下载

Unknown DDoS Attack Detection with Fuzzy C-Means Clustering and Spatial Location Constraint Prototype Loss

2024-03-13ThanhLamNguyenHaoKaoThanhTuanNguyenMongFongHorngandChinShiuhShieh

Computers Materials&Continua 2024年2期

Thanh-Lam Nguyen ,Hao Kao ,Thanh-Tuan Nguyen ,Mong-Fong Horng,⋆ and Chin-Shiuh Shieh,⋆

1Department of Electronic Engineering,National Kaohsiung University of Science and Technology,Kaohsiung,807618,Taiwan

2Department of Electronic and Automation Engineering,Nha Trang University,Nha Trang,650000,Vietnam

ABSTRACT Since its inception,the Internet has been rapidly evolving.With the advancement of science and technology and the explosive growth of the population,the demand for the Internet has been on the rise.Many applications in education,healthcare,entertainment,science,and more are being increasingly deployed based on the internet.Concurrently,malicious threats on the internet are on the rise as well.Distributed Denial of Service(DDoS)attacks are among the most common and dangerous threats on the internet today.The scale and complexity of DDoS attacks are constantly growing.Intrusion Detection Systems (IDS) have been deployed and have demonstrated their effectiveness in defense against those threats.In addition,the research of Machine Learning(ML)and Deep Learning(DL)in IDS has gained effective results and significant attention.However,one of the challenges when applying ML and DL techniques in intrusion detection is the identification of unknown attacks.These attacks,which are not encountered during the system’s training,can lead to misclassification with significant errors.In this research,we focused on addressing the issue of Unknown Attack Detection,combining two methods:Spatial Location Constraint Prototype Loss(SLCPL)and Fuzzy C-Means(FCM).With the proposed method,we achieved promising results compared to traditional methods.The proposed method demonstrates a very high accuracy of up to 99.8% with a low false positive rate for known attacks on the Intrusion Detection Evaluation Dataset(CICIDS2017) dataset.Particularly,the accuracy is also very high,reaching 99.7%,and the precision goes up to 99.9% for unknown DDoS attacks on the DDoS Evaluation Dataset (CICDDoS2019) dataset.The success of the proposed method is due to the combination of SLCPL,an advanced Open-Set Recognition(OSR)technique,and FCM,a traditional yet highly applicable clustering technique.This has yielded a novel method in the field of unknown attack detection.This further expands the trend of applying DL and ML techniques in the development of intrusion detection systems and cybersecurity.Finally,implementing the proposed method in real-world systems can enhance the security capabilities against increasingly complex threats on computer networks.

KEYWORDS Cybersecurity;DDoS;unknown attack detection;machine learning;deep learning;incremental learning;convolutional neural networks (CNN);open-set recognition (OSR);spatial location constraint prototype loss;fuzzy c-means;CICIDS2017;CICDDoS2019

1 Introduction

After the outbreak of the new Coronavirus pneumonia in 2020,people have become more reliant on the internet.Entertainment,shopping,education,and other remote activities on the Internet have become increasingly diverse and rapidly growing.Naturally,the number of DDoS attacks has been significantly increasing.CloudFlare,a content delivery network (CDN) and attack mitigation service provider,published a quarterly investigation on DDoS attacks[1]showing thousands of attacks occurring every month.While most attack traffic remains below 500 Mbps,this volume is sufficient to cause short disruptions for some enterprise services.However,the stability of networks and services is of utmost importance for a business service provider.The service disruption caused by an attack can lead to business losses,and more importantly damage to their image and reputation.

Defending enterprise network systems against DDoS attacks is an essential demand.Nevertheless,DDoS attack methods are continually evolving and becoming more diverse.In contemporary times,the IDS system plays a pivotal role in securing computer networks by identifying and responding to malicious activities in general,and DDoS attacks in particular.Researching and integrating cuttingedge technologies,such as ML and DL technologies into IDS systems to enhance their capabilities is an inevitable trend.Many related experiments have demonstrated that ML and DL methods achieve very high accuracy,up to 98%,on conventional data [2–4].However,existing ML and DL methods primarily focus on modeling and normalizing known attack patterns.Consequently,these methods often struggle to effectively identify unknown attacks with new characteristics,leading to a reduction in the defensive efficacy of IDS systems.Therefore,the identification of unknown DDoS attacks continues to pose a significant challenge for IDS systems.

The main objective of our research is to propose multiple IDS methods capable of concurrently detecting both known and unknown DDoS attacks and evaluating their detection performance.To address this challenge,we emphasize the necessity of the Open-Set Recognition (OSR) technique.The OSR technique deals with the challenge of identifying and classifying objects or instances not encountered during the training phase.Recently,there have been several outstanding achievements in OSR techniques[5].Notably,the Spatial Location Constraint Prototype Loss(SLCPL)[6]method is a novel OSR technique designed for deep neural networks.It has demonstrated superior effectiveness compared to many previous OSR techniques when testing on many different datasets,with a majority achieving accuracy above 88%[6].However,when applied alone in the context of detecting unknown DDoS attacks,the effectiveness of SLCPL is not excellent enough.Therefore,we have considered a supporting method,the Fuzzy C-Means (FCM) [7] clustering technique,with a prominent soft clustering feature.The combination of SLCPL and FCM enhances the ability to recognize changes in data patterns,increasing the accuracy of identifying unknown attacks.

Through the utilization of the FCM and the SLCPL,our proposed method achieves an impressive accuracy of up to 99.7% and a precision of up to 99.9% for unknown DDoS attack detection on the open CICDDoS2019 dataset.Simultaneously,it maintains a high accuracy of 99.8% for known attack detection on the well-known CICIDS2017 dataset.With this achievement,we aim to enhance tech organizations’detection capabilities,safeguarding network infrastructure and ensuring service continuity amid evolving network threats.

The key contributions of this research are focused on the following aspects:

•We have selected AlexNet[8]as the neural network architecture for our training process,and we have enhanced the AlexNet architecture for more effective classification of conventional data.

• Through the SLCPL and FCM methods,we have adjusted the positions of unknown attack samples in the feature space,bringing them closer to specific categories and thus enabling the recognition of unknown DDoS attacks.

The structure of this article is as follows: Section 2 presents a concise summary of pertinent literature.Section 3 encompasses the system’s architecture,the methodologies,and the algorithms utilized.Section 4 provides a detailed account of the experimental procedure and presents the findings obtained.The research is concluded in Section 5,which also explores potential directions for future research.

2 Related Work

Detecting and mitigating DDoS attacks is a top concern in modern cybersecurity.Several methods have been proposed to handle this problem,varying from traditional signature-based methods to more advanced anomaly detection techniques.

2.1 DDoS Attack Detection Technique

Traditional signature-based methods have undeniable advantages such as effectiveness against known DDoS attacks,low false positive rate,and ability to respond quickly to attacks.However,the signature-based method clearly shows its disadvantages against unknown DDoS attacks,zeroday attacks as well and high dependence on databases.Researchers have recognized the limitations of signature-based systems and have explored other approaches.In recent years,ML and DL technologies have been a research trend in developing IDS systems[9].

Recent research by Maseer et al.[2]synthesized and compared current popular ML algorithms on the CICIDS2017 dataset.In their research,multiple algorithms have demonstrated highly favorable outcomes,including Support Vector Machine (SVM),Random Forest (RF),Naive Bayes (NB),Artificial Neural Networks(ANN),and Convolutional Neural Network(CNN).

Ho et al.[3]introduced a novel CNN approach that surpasses conventional single-class classification methods,delivering exceptional performance in the realm of multi-class classification,particularly in the identification of both known and unknown attacks.Furthermore,many studies have applied CNN-based models,producing highly effective results[10,11].

Kim et al.[4] introduced a DL model that combines CNN and Recurrent Neural Networks(RNN)to detect Denial of Service(DoS)attacks.They optimized the CNN design through numerous experiments and achieved impressive results,particularly in terms of accuracy on two datasets KDD and CSE-CIC-IDS2018.

Kiranyaz et al.[12] surveyed 1D Convolutional Neural Networks (1D-CNN) and their main applications.The 1D-CNN model [12,13] has demonstrated its advantages in a scarce training data environment as well as in some specific fields such as anomaly detection,personalized biomedical data classification,early diagnosis,etc.

Beitollahi et al.[14] proposed an ML approach for detecting DDoS traffic.Their approach employs a Radial Basis Function (RBF) network in conjunction with the cuckoo search algorithm(CSA)to identify Application-layer DDoS attacks.This approach stands out for its effective performance in identifying DDoS traffic compared to several other methods,with notably low error rates and high precision.

Laghrissi et al.[15]presented an IDS that utilizes Long Short-Term Memory(LSTM)networks,in combination with Principal Component Analysis(PCA)and Mutual Information(MI)techniques.This method achieves outstanding accuracy compared to other methods in both binary and multiclass classification.

2.2 Open Set Recognition Technique

The Open Set Recognition(OSR)technique aims to classify data into known classes and detect instances that do not belong to any known class,unknown class,or new instances[5].

Recent advancements in OSR techniques include the work of Bendale et al.[16],who introduced the OpenMax technique.This technique has become a crucial method in the field of open-set recognition,where systems need the capability to recognize objects that do not belong to any of the classes learned during training.The fundamental theory behind the OpenMax algorithm involves calculating the“openness”level of a data sample based on the distance between that sample and the nearest samples in the nearest known class.OpenMax allows for the determination of whether a data sample belongs to a known class,an open-set class(a class corresponding to openness),or is unknown(not belonging to any known class).This enhances recognition capabilities in real-world situations where objects that do not belong to any known class may appear.

Ge et al.[17]have published their research results as a development method of OpenMax.Unlike previous methods where unknown classes were inferred based on characteristics or distance decisions with known classes,the authors’novel approach offers a clear framework and decisive criterion for unknown classes.The proposed technique,known as Generative OpenMax,enhances OpenMax by incorporating Generative Adversarial Networks(GANs)to create images for new classes artificially.Additionally,Yoshihashi et al.[18]have proposed the Classification-Reconstruction learning for the OSR method (CROSR) to enhance the reliable detection of unknown classes without affecting the accuracy of known classes.Research results have shown the superiority of this method over traditional approaches.

Generally,Open-Set Recognition has been receiving the attention and research efforts of many scientists and scholars worldwide.The mentioned papers provided earlier represent only a fraction of the noteworthy contributions within this field.

2.3 Unknown DDoS Attack Detection Technique

The task of identifying unknown DDoS attacks is difficult,requiring inventive methods to improve the protection of network infrastructures.In recent years,researchers have made significant progress in the field of unknown DDoS attack detection,employing various techniques and methodologies to identify and mitigate these threats[19].This section provides an overview of recent advances in unknown DDoS attack detection,highlighting key contributions in this domain.

Extreme Value Theory (EVT) has been utilized to enhance the identification of DDoS attacks[20].This statistical methodology has proven effective in capturing the extreme behaviors of network traffic,aiding in the recognition of anomalous patterns associated with DDoS attacks.

Gaussian Mixture Models(GMMs)and related methods have been widely used for determining the distribution of network traffic data [21,22].Chapaneri et al.have explored the use of several GMMs for modeling individual input features in the context of DDoS detection[21].By modeling the underlying data distribution,this technique can help distinguish between benign and malicious traffic patterns,contributing to more accurate detection.Shieh et al.have adopted DL techniques,including Bidirectional Long Short-Term Memory(BI-LSTM)networks and GMMs,for the identification of unknown DDoS attacks [22].Their research demonstrates the efficacy of combining deep learning and statistical modeling for robust attack identification.

Yang et al.have introduced the AutoEncoder-based DDoS attacks Detection Framework (AED3F),a novel technique for threat detection,which has shown promise in identifying anomalous network behavior [23].By leveraging autoencoders,they contribute to the arsenal of tools for enhancing DDoS attack detection.

Generative Adversarial Networks(GANs)have demonstrated efficacy in DDoS attack detection[24,25].GANs are proficient in generating synthetic data that can be used to compare and contrast with real network traffic,aiding in the identification of malicious activity.Lin et al.have presented the IDSGAN framework,which incorporates GAN networks as part of a defense system to protect against DDoS attacks[24].This approach focuses on the proactive use of GANs to safeguard network resources.Chauhan et al.have harnessed Wasserstein GAN (WGAN) to address training issues in DDoS detection models [25].By incorporating WGAN,they aim to improve the robustness and reliability of DDoS detection techniques.

2.4 Fuzzy C-Means Clustering and Comparison between Recent Algorithms

Fuzzy C-Means (FCM) clustering is a common unsupervised learning technique used for data clustering and classification.In the context of DDoS attack detection,FCM has been applied to group network traffic data into clusters,allowing the identification of abnormal traffic patterns related to attacks.The work by Wu et al.[26] used FCM clustering to partition network traffic data into distinct clusters and then employed anomaly detection techniques to identify DDoS attacks within these clusters.This approach has shown promising results in the identification of both known and unknown attacks.

In this section,we have discussed various approaches related to DDoS attack detection,OSR techniques,and Fuzzy C-Means clustering.For convenience,we have established a comparison table among recent research studies in machine learning and deep learning.Table 1 summarizes the techniques used,the scope of the problems,and some limitations of the studies.

Table 1:Comparison between recent machine learning techniques

Recent notable studies have primarily focused on addressing the challenge of Closed Set Recognition(CSR).Some performance evaluations are conducted on outdated datasets(KDD99)or specific datasets such as NSL-KDD or CICIDS2017.Regarding methods for tackling the OSR problem,some approaches exhibit drawbacks,such as complexity and difficulty in real-world deployment.Our proposed method successfully addresses both CSR and OSR problems,with performance validated on both the CICIDS2017 and CICDDoS2019 datasets.This approach achieves exceptionally high performance and flexibility for real-world deployment.Of course,our method still has some limitations,which will be discussed in the section“Discussion”of this article.

In the following sections,we will explain our method in detail,a solution involving a deep neural network combined with the SLCPL and FCM techniques to detect unknown DDoS attacks.This combination is relatively new and remains unexplored.

3 Proposed IDS

Figs.1a and 1b show the architecture of our proposed IDS developed in this research.The architecture consists of three main modules:Data Preprocessing Module,OSR Module,and Unknown Detecting Module.The primary machine learning technologies include AlexNet,SLCPL,and FCM.AlexNet is a deep CNN architecture that takes on an essential role in the development and popularization of deep learning.The SLCPL is a novel OSR technique with outstanding advantages.Additionally,we utilized the FCM clustering technique to distinguish between known and unknown attacks.By including the clustering method,the classification performance is enhanced compared to using SPCLP alone.

Figure 1 a:Training phase

3.1 AlexNet

AlexNet,a Convolutional Neural Network,was developed by Krizhevsky and his team in 2012.It garnered significant acclaim for its exceptional performance in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) competition [8].This network architecture marked an important milestone at the time,having a major influence on the advancement of deep learning and the development of the task of image classification.

The main characteristics of AlexNet lie in its combination of depth,convolution,and pooling layers,along with a large number of trainable parameters.Its architecture is as follows:

• Convolutional and Pooling Layers:AlexNet utilizes convolutional and pooling layers to extract features.The layers efficiently capture specific characteristics within images and employ pooling layers to decrease the complexity of feature maps while preserving important features.

• Activation Function: The (Rectified Linear Unit) ReLU activation function is applied by AlexNet after each convolutional layer to mitigate the issue of vanishing gradients and accelerate the training process.

• Dropout: The dropout technique is introduced in the fully connected layers of AlexNet,mitigating overfitting and enhancing the model’s generalization capability.

• Multi-GPU Training:AlexNet pioneered the use of multiple GPUs for training in deep learning models,significantly accelerating the training process.

The success of AlexNet underscores the formidable capabilities of deep learning in tasks like image classification.While more profound neural network architectures emerged in the subsequent years,AlexNet’s role as a starting point in deep learning has cast a lasting influence on the design and development of subsequent models.Fig.2 describes the architecture of the AlexNet network.

Figure 2:AlexNet architecture

3.1.1 1D AlexNet

In our model,we adapted the architecture of AlexNet using one-dimensional convolution.The structure of 1D AlexNet is similar to 2D AlexNet,and the main difference is that 1D AlexNet processes one-dimensional input data.In this research,we have modified the model of 1D AlexNet as shown in Fig.3.

The model includes some important components as follows.First,convolutional layers use convolution to capture features from one-dimensional data.Next,activation function classes introduce nonlinearity,using the ReLU function.Subsequently,pooling layers are incorporated to diminish the dimensions of the feature map and the computational load,while preserving crucial features.Here,we utilized two kinds of pooling layers,Max Pooling and Average Pooling.

Additionally,1D AlexNet includes fully connected layers to map features to the end goal of the classification.Normally,the output layer utilizes the SoftMax function to produce the final classification result.However,in this research,we used the SLCPL technique for classification during the training process as well as the evaluation process.Preceding a fully connected layer,a dropout layer is employed to alleviate overfitting.Finally,the proposed modified model is the best result of our multiple testing iterations.

3.2 Spatial Location Constraint Prototype Loss

Spatial Location Constraint Prototype Loss (SLCPL) [6] is an OSR technique,a loss function designed for the Convolutional Neural Network model.The SLCPL extends from the Generalized Convolutional Prototype Learning (GCPL) [31] by introducing spatial location constraints to solve a common issue faced by most current training methods,such as SoftMax and GCPL.The problem lies in the concentration of known features towards the center of the feature space,resulting in the overlapping of known and unknown feature distributions.To prevent this phenomenon,a spatial location restriction is incorporated into the loss function while undergoing the prototype learning procedure.This allows SLCPL to manipulate the spatial positioning of the prototypes.Consequently,the known features tend to be located in the periphery of the feature space,while the center of the feature space is typically reserved for unknown features.

We will describe the SLCPL function as follows:

Given a training set S={(x1,y1),(x2,y2),(x3,y3),....}with N known classes,and the labelxiisyi∈{1,2,3,...,N}.A CNN is a classifier with the parametersθand the embedding function F.The prototypes O={Oi,i=1,2,3,...,N}are initialized randomly or with a distribution.

With a training sample(x,y),the SLCPL loss function can be represented by:

HereLG(x,y;θ,O)is a GCPL loss function andslc(O)is a spatial location constraint introduced by SLCPL.

LG(x,y;θ,O)can be represented as follows:

The functionl(x,y;θ,O)will be optimized to cluster different known classes.Andl(x,y;θ,O)can be represented as follows:

whered(F(x),Ok)is the Euclidean distance betweenF(x)andOk.

The paramλand the constraintpl(x;θ,O)is used to enhance the compactness of the cluster.

wherexkis the training sample of class k.

The spatial location constraintslc(O)can be represented as follows:

To clarify,slc(O)represents the variability of the distancesribetween the prototype of the cluster and the centerOc.

As presented,the time complexity to compute the SLCPL loss function for each data sample is O(N.D),where N is the number of known classes and D is the dimensionality of the feature space.

Finally,SLCPL is a new OSR technique with the ability to enhance the accuracy for classifying known and unknown classes.And SLCPL and 1D AlexNet are a promising combination for our OSR module.

3.3 Fuzzy C-Means

Fuzzy C-Means(FCM)[7]is a clustering analysis method to partition samples into multiple fuzzy categories.The algorithm considers the degree of membership of each sample to every category rather than just employing traditional hard clustering methods.In this research,we integrated FCM with OSR modules including SLCPL and 1D AlexNet.The 2-feature output from the OSR module will be clustered by FCM for Unknown DDoS attack detection.

We will describe the FCM algorithm as follows:

Given N samples and C cluster centers,each sample is denoted asxi,and each cluster center asvj.Define the fuzzy membership matrix U,whereUijis the degree of membership of the samplexito cluster centervj.The objective function of FCM is to minimize the following cost function:

Here,m is the fuzziness param that controls the degree of fuzziness in clustering.This objective function combines the weighted total of Euclidean distances between the samples and the cluster centres.The algorithm operates in the following manner:

i.Initialize the fuzzy assignment matrix U and cluster centers V.

ii.Perform iterative updates until a stopping condition is met(the maximum number of iterations or convergence of the objective function):

a.Update U:Compute the degree of membership of each sample to each cluster center using the membership update formula:

b.Update V:

iii.Returns the final fuzzy assignment matrix U and the cluster centers V.

As presented,the time complexity of FCM typically is O(I.N.C.D),where I represents the number of iterations,N represents the number of data samples,C represents the number of clusters,and D represents the dimension of each data sampling.

The core principle of this algorithm lies in iteratively updating the fuzzy assignment matrix and cluster centers,continuously adjusting the membership degrees of samples and the positions of cluster centers to minimize the objective function.Through the fuzzy assignment matrix,one can obtain the fuzzy membership degrees of each sample concerning each cluster center rather than solely obtaining rigid classification outcomes.

4 Experiments and Results

4.1 Dataset

In this research,we utilized the CICIDS2017 [32] and CICDDoS2019 [33] datasets to evaluate the proposed IDS.The CICIDS2017 and CICDDoS2019 are widely used datasets for IDS research and development.Developed by the Canadian Institute of Cyber Security(CIC),they aim to simulate various cyber attacks and common traffic in real-world networks.The simulation environment is set up with complete network topology,actual network traffic,and attack behaviors.They are used for various purposes,such as training,validating,and testing IDS performance.Both are valuable for many tasks like feature selection,model optimization,algorithm comparison,and other intrusion IDSrelated research.

The CICIDS2017 dataset was collected from 9 AM on Monday,July 03,2017,to 5 PM on Friday,July 07,2017,spanning a duration of 5 days.The CICIDS2017 dataset encompasses a diverse range of attacks,such as Brute Force,DoS,Web Attack,Infiltration,Botnet,Heartbleed,and DDoS.In addition,the CICDDoS2019 dataset includes both benign traffic and the latest prevalent DDoS attacks such as DNS,LDAP,NETBIOS,and SNMP.

We utilized the CICIDS2017 Wednesday dataset to train the model in the context of known attack detection.In addition,we utilized CICIDS2017 Friday and CICDDoS2019 datasets in the context of unknown attack detection.Table 2 describes the structure of the datasets we used in this research.

Table 2:Statistics of experimental datasets

4.2 Data Pre-Processing

Data pre-processing is a very important phase in conducting machine learning experiments.Preprocessing ensures the accuracy and reliability of the output.In this research,this phase includes Data cleaning and Feature transformation.

4.2.1 Data Cleaning

Data cleaning is a necessary process during preprocessing.Error data will be corrected to ensure the experiment produces the best and most reliable results.In this research,we clean the data by using the following methods:

•For data samples whose value of any feature is NAN,we proceed with elimination.

•For features whose value is INF,we replaced the value with 10E10.

•For features with negative values,we replaced the negative value with 0.

4.2.2 Feature Transformation

In this research,we have referred to the processing methods of Chapaneri et al.in the article[21]during the data transformation process.The feature value is transformed according to the following formula:

After data are transformed,the values of all features will be in the range[0,1].

4.3 Experimental Environment

In this research,we utilized a workstation with the Ubuntu 20.04 OS.Our workstation is equipped with an AMD Ryzen 5700X 8C16T processor,96 GB DDR4 memory,and Nvidia RTX3070 and Nvidia RTX2060 as computational accelerators.Regarding the model framework,we undertake development using PyTorch 1.11.0,Sklearn,and Python 3.9.12.

4.4 Model Training

In this research,we have built the 1D AlexNet model using the Pytorch framework and calculated the evaluation metrics using Sklearn.To evaluate the model’s effectiveness,we trained the model ten times using a different random number each time.We chose Adam as the optimizer.The detailed settings of the parameters are described in Table 3.And Fig.4 shows the network model of 1D AlexNet.

Figure 4:1D AlexNet model architecture

Table 3:Training parameters

4.5 Evaluation Metrics

In machine learning and pattern recognition,evaluation metrics serve as quantifiable measures to gauge model performance and effectiveness.These metrics assess and compare a model’s performance on a specific task or problem.In the context of this research,we used the following metrics:

• Accuracy:Evaluates the accuracy of model predictions.

• Precision:Assess the ratio of true positive predictions to the total predicted positive samples.

• Recall:Assess the ratio of true positive predictions to the total true positive samples.

• F1 score:A comprehensive evaluation metric by considering both precision and recall.

These metrics will measure the model’s performance in distinguishing between different classes.Table 4 is the definition of the confusion matrix in machine learning.

Table 4:Confusion matrix

4.6 Known DDoS Detection

For known DDoS detection,we utilized CICIDS2017 Wednesday dataset for the training model.Initially,we trained the 1D AlexNet model and visualized the 2-feature output from the 1D AlexNet module in feature space,as shown in Fig.5.We observed that the training process made the distribution of the known classes more dispersed.In some cases,the distribution areas of known classes have overlapped with one another.

Therefore,we continued to modify 1D AlexNet several times to improve the result.We have tried to increase the number of convolutional layers of the 1D AlexNet.After numerous improvements,we achieved the desired outcomes.Fig.6 shows that the result of our final model (Fig.3) improves significantly compared to the initial model in terms of the distribution of known datasets.

Furthermore,we obtained excellent results as shown in Fig.7,when evaluating known attacks by using the same CICIDS2017 Wednesday dataset.All evaluation metrics exceed 0.98.

Figure 6:Final 1D AlexNet model with fully connected layer

Figure 7:Known attack evaluation matrix

4.7 Unknown DDoS Detection

4.7.1 Unknown Attack Detection with 1D AlexNet

The 1D AlexNet model has shown good performance against known attacks.Next,we continued to evaluate the ability to defend against unknown attacks.Firstly,we used the CICIDS2017 Friday dataset to evaluate the trained model.Table 5 shows the results and correlation comparisons.

Table 5:Unknown DDoS detection performance for the CICIDS2017 dataset

When evaluating by CICIDS2017 Friday dataset,the accuracy is 0.7910 and other metrics are mostly at an average level.Because the CICIDS2017 Friday and CICIDS2017 Wednesday datasets are collected in the same environment.They have a certain similarity,especially in the BEGIGN traffic.However,the results show that the standalone model is not good enough to identify novel attacks.

Next,we used the CICDDoS2019 dataset to further evaluate the model.Table 6 shows the results and correlation comparison for the CICDDoS2019 dataset.

Table 6:Unknown DDoS detection performance for the CICDDoS2019 dataset

The results are really bad when evaluating the model by CICDDoS2019.Because the CICDDoS2019 and CICIDS2017 datasets are different.Even though they are collected by the same organization Canadian Institute of Cyber Security(CIC).They have different simulation environments and the collected features are not the same.The attack pattern of CICDDoS2019 is strange to the trained model.That fooled the model into thinking the novel attacks were regular accesses.Therefore,we need to combine the model with an Unknown DDoS Detecting module to enhance the system’s capabilities.

4.7.2 Unknown Attack Detection Module

In this research,we calculated the Outlier Detection Rate(ODR)to determine the efficiency of the Unknown DDoS Detecting module.The formula of ODR is as follows:

whereNOutlieris the number of unknown samples that our system believes should be reviewed by network experts,and N is the total number of samples in the procedure.

The CICIDS2017 Friday and CICIDS2017 Wednesday are collected in the same environment and are quite similar.Therefore,the ODR of CICIDS2017 Friday is only 0.082.However,other CICDDoD2019 datasets have very high ODR of over 0.99.Table 7 shows that our Unknown DDoS Detecting module has very high performance when identifying outlier samples.

Table 7:The outcome of the Unknown DDoS detecting module

4.7.3 Incremental Learning

After being processed by the Unknown DDoS Detecting module,suspicious samples will be analyzed and labeled by network experts.Next,new labeled data will be used for the incremental learning process.This process improves the machine learning model without the need to retrain it from scratch.The incremental learning ensures continuous operation of the system as well as full updates on new real-life cyber attack patterns.Table 8 shows the outcomes of the model after incremental learning.

Table 8:Our model outcomes post incremental learning

After incremental learning,the model’s performance has significantly improved.For the CICIDS2017 Friday dataset,despite the ODR rate being only 0.0842,it has led to an increase in accuracy of approximately 0.2.For the CICDDoS2019 dataset,the measured metrics exceeded expectations when all surpassed 0.98.This proves the module combined FCM and SLCPL is effective in distinguishing known and unknown traffic.Most of the samples determined as unknown traffic are strange to the system and can enhance the system’s performance.Therefore,the system’s capability to detect unknown attacks is guaranteed at the highest level.

4.8 Comparison of the Proposed Method with the Existing Works

We used a lot of recent research in detecting DDoS attacks for the comparison.Chosen algorithms include Deep Neural Network with Genetic Algorithms(2023)[28],Multilayer Perceptron(2023)[29],and Convolutional Neural Network featuring Geometrical Metric (2023) [34].All algorithms were evaluated on the CICIDS2017 dataset for known attack detection.Table 9 shows the comparisions of our method with recent ML algorithms.

Table 9:Our method in the comparison with recent ML algorithms on CICIDS2017

The performance metrics of our proposed method are slightly larger than DNN-GA and MLP.They have accuracy,precision,and recall of approximately 0.99.Compared with the CNN-Geo algorithm,the proposed method has similar performance with negligible differences.

Next,we continued to compare with other algorithms in detecting unknown attacks.The algorithms we chose to compare include GMM(2021)[21],DBSCAN with Random Forest(DBSCANRF) (2022) [27],DBSCAN with SVM (DBSCAN-SVM) (2022) [27],One-Dimensional Deep High-Resolution Network with One-Class SVM (1D-DHRNet-OCSVM) (2022) [30],and CNN featuring Geometrical Metric(CNN-Geo)(2023)[34].All algorithms are trained on the CICIDS2017 dataset and evaluated on other datasets.The performance metrics of the overall system are listed in the Table 10.

Table 10:Our method in comparison with recent ML algorithms in unknown DDoS attack detection

Compared with DBSCAN-SVM and DBSCAN-RF methods,our method has much more outstanding performance metrics.Compared with GMM(2021),1D-DHRNet-OCSVM(2022),and CNN-Geo (2023),our method has slightly larger metrics.Combining the two cases of known and unknown attack detection,our performance is equal to or more outstanding than other methods.This proves that our research has caught up and surpassed recent outstanding research.

4.9 Discussion

The validity of a study is a concern that we prioritize throughout our research.For internal validity,we are meticulous in applying methods,algorithms,and experimental procedures in the fields of machine learning and deep learning.We have used multiple random seeds,as well as dropout techniques to avoid overfitting,during both model training and evaluation.Regarding external validity,in this research,we utilized two main datasets,CICIDS2017 and CICDDoS2019.The validity of these datasets will significantly impact the applicability of our model in real-world scenarios.These are two well-known and standard datasets in the broader field of network attack detection,including specific DDoS attacks.All of these efforts are aimed at ensuring the reliability of our research.

However,new attack methods and their variants are continually evolving.Deploying an unknown attack detection module along with ongoing support from network experts and machine learning incremental learning techniques forms a robust combination that enhances the system’s capabilities over time.While the defense of our system has demonstrated effectiveness against a variety of DDoS attacks,it is essential to recognize certain limitations.This is particularly evident when confronting Layer 7(L7)DDoS attacks and the intricate strategies employed in adversarial scenarios.

As mentioned,our proposed model heavily relies on two primary datasets,CICIDS2017 and CICDDoS2019,focused predominantly on DDoS attacks at Layer 3 and Layer 4.This specialization in lower network layers may result in inadequacies in addressing the distinct characteristics of DDoS attacks at the L7 layer,the Application layer.Notably,L7 DDoS attacks pose a unique challenge as they exploit vulnerabilities within applications,services,or protocols,diverging from conventional approaches that target network infrastructure.Detection mechanisms may encounter challenges in discerning benign from harmful activities at this granular level.Techniques such as encryption-based intrusion and complex application-layer exploitation,while effective,may incur significant processing costs on lightweight systems,impacting overall performance and response times during high-intensity attacks.

Furthermore,our system has not been studied to face adversarial DDoS attacks,a major evolving threat.These attacks dynamically adjust in attack types,IP address manipulation,and various attack vectors,deliberately employed to complicate defense mechanisms.

In summary,minimizing L7 DDoS attacks and adversarial DDoS attacks requires a sophisticated and adaptive approach,while acknowledging the limitations mentioned above.Traditional mitigation strategies may not be sufficient,and research on advanced techniques needs to be conducted and implemented.

5 Conclusion and Future Work

Today,business service providers are striving to achieve stable service quality,which has become their important goal.However,some individuals view DDoS attacks as a way to generate revenue.DDoS attacks are becoming increasingly diverse and complex.Meanwhile,intrusion detection systems trained on limited datasets find it difficult to identify novel,unknown attacks.To solve this challenge,our research proposes a hybrid approach that combines the characteristics of supervised and unsupervised techniques.We have adopted the 1D AlexNet network as our main technique.We improved the 1D AlexNet network several times to achieve the desired model.Additionally,the combination of FCM and SCLPL was successful in identifying unknown attacks with very high efficiency.Then,we ask the help of network experts to label suspicious traffic and use reinforcement learning methods to enhance the model.As a result,the improved 1D AlexNet model achieved very high performance,with an accuracy of up to 99.8% for known attacks and 99.7% for unknown attacks.The proposed system has demonstrated strong defense capabilities against both known and unknown attacks.

Currently,the unknown detection method by using SCLPL and FCM is not only compatible with the 1D AlexNet model but can also be extended to other CNN-based models.This opens up research opportunities for the future.We plan to do more research with more complex CNN models as the core of the system to replace 1D AlexNet.Alongside that,we will try the integration of other unknown detection methods.The diversity of unknown detection methods ensures that our system is sensitive enough to detect novel attacks.All these methods aim to ensure the strong defense capabilities of the IDS system against the complex developments on the internet.

Acknowledgement:We would like to thank the National Science and Technology Council,Taiwan for funding this research.Additionally,we appreciate the financial support from the Institute of IoT Cybersecurity granted with ICP DAS Co.,Ltd.

Funding Statement:This research was partly supported by the National Science and Technology Council,Taiwan with Grant Numbers 112-2221-E-992-045,112-2221-E-992-057-MY3 and 112-2622-8-992-009-TD1.

Author Contributions:Conceptualization,C.-S.Shieh;methodology,T.-L.Nguyen and H.Kao;software,T.-L.Nguyen;validation,T.-T.Nguyen;visualization,H.Kao;writing—draft manuscript preparation,T.-L.Nguyen;writing—review and editing,T.-T.Nguyen and C.-S.Shieh;project funding,C.-S.Shieh and M.-F.Horng;project supervision,M.-F.Horng;project administration,M.-F.Horng.All authors have reviewed and approved the published version of the manuscript.

Availability of Data and Materials:Data supporting the reported results are available upon request.

Conflicts of Interest:The authors declare that they have no conflicts of interest to report regarding the present study.