Hidden Markov model based epileptic seizure detection using tunable Q wavelet transform

2020-09-21 Deba Prasad Dash, Maheshkumar Kolekar

THE JOURNAL OF BIOMEDICAL RESEARCH, 2020, Issue 3

Deba Prasad Dash, Maheshkumar H Kolekar

Department of Electrical Engineering, Indian Institute of Technology Patna, Bihta, Bihar 801103, India.

Abstract Epilepsy is one of the most prevalent neurological disorders, affecting 70 million people worldwide. The present work focuses on designing an efficient algorithm for automatic seizure detection using the electroencephalogram (EEG), a noninvasive procedure for recording neuronal activity in the brain. The underlying dynamics of EEG signals are extracted to differentiate healthy and seizure EEG signals. Shannon entropy, collision entropy, transfer entropy, conditional probability, and Hjorth parameter features are extracted from subbands of the tunable Q wavelet transform (TQWT). The most discriminative decomposition level for each feature vector is selected using the Kruskal-Wallis test. The selected features are combined using the discriminant correlation analysis fusion technique to form a single fused feature vector. The accuracy of the proposed approach is higher for Q=2 and J=10. Transfer entropy is observed to be significant for different class combinations. The proposed approach achieved 100% accuracy in classifying healthy-seizure EEG signals using simple and robust features and a hidden Markov model, with low computation time. The efficiency of the approach is also evaluated in classifying seizure and non-seizure surface EEG signals, where the system achieved 96.87% accuracy using efficient features extracted from different J levels.

Keywords: electroencephalogram, epilepsy, seizure, tunable Q wavelet transform, entropy, hidden Markov model

Introduction

EEG is a noninvasive, low-cost technique used to record the electrical activity of neurons and to detect neurological disorders such as seizure and dementia[1]. Epilepsy is one such neurological disorder, affecting 70 million people worldwide. Seizure patterns vary considerably depending on the type of epilepsy. Seizure detection in hospitals is currently manual, and its accuracy depends largely on the doctor's expertise. There is a need for automatic seizure detection software that can improve accuracy and reduce the time needed for seizure detection.

Detecting interseizure activity can help clinicians predict seizures when a clear seizure pattern is not present in the EEG signal. A lot of work is being done in this area. Different authors have used the wavelet transform[2–3] and empirical mode decomposition (EMD) in seizure detection. In one approach, EMD was used to decompose signals, and features such as Higuchi's fractal dimension and collision, Shannon, and minimum entropy were extracted from each intrinsic mode function (IMF)[4]. EMD-IMF coefficients along with a multilayer perceptron classifier were used for seizure detection[5]; overall, that system achieved only 95.42% accuracy. The discrete wavelet transform and its variants, such as the dual tree complex wavelet transform, have been found useful for feature extraction in seizure detection[6]. Features extracted from the IMFs obtained by EMD and its variants, such as complete ensemble empirical mode decomposition, are significant markers of seizure[7]. Features extracted directly from the EEG can also be used for efficient seizure classification. Features such as approximate entropy, Lempel-Ziv complexity, sample entropy, and symbolic entropy are significant seizure indicators[8]. Time domain features such as mean, maximum, and minimum amplitude, and moments such as variance, skewness, and kurtosis, can be suitable features for seizure classification[9]. Simple distance features such as the Bhattacharyya distance were proposed as efficient seizure indicators[10]. Jae-Hwan Kang et al[11] proposed the spectral power of Hjorth mobility components for seizure identification and quadratic discriminant analysis for classification. Wavelet based fractal analysis can be a good indicator of seizure[12]. Combined time and frequency domain features along with a quadratic classifier were evaluated for seizure detection and achieved 98.7% accuracy[13]. The discrete wavelet transform along with scatter matrices for dimension reduction and a quadratic classifier achieved 99% accuracy[14]. Support vector machine and Bayesian probabilistic classifiers[15] are widely used in epilepsy detection[16].

Apart from the wavelet transform and EMD, the empirical wavelet transform has also been applied to seizure detection, where it was used to extract efficient features for detecting seizure events from surface EEG signals[17]. Orthonormal triadic wavelets along with statistical features and a KNN classifier achieved good accuracy[18].

However, these classifiers carry a risk of over-training and require a large database for training. A hidden Markov model (HMM) is preferred for seizure detection in this work because it requires fewer training samples. We have also observed very little work on seizure detection using HMM based classifiers. In our previous work[19], an HMM based classifier with entropy features and Hjorth parameters was proposed. In the present work, we use tunable Q wavelet transform (TQWT) based features and a feature fusion technique to evaluate seizure detection accuracy for eight different class combinations. The approach aims at a system capable of achieving good accuracy with few training samples, and it focuses on an automatic HMM based seizure detection algorithm for intracranial EEG signals. Using only a small part of the dataset (20%) for training, we achieved accuracy of up to 100%. Simple and efficient features are extracted, and a feature fusion approach is used to generate a single fused feature set. We present a detailed analysis of the contribution of each selected feature to classification. The proposed HMM based approach achieved good accuracy in seizure classification compared with state-of-the-art support vector machine and bagging methods. The efficiency of the approach is also evaluated on surface EEG signals, where high accuracy is observed in classifying seizure and non-seizure EEG.

Materials and methods

Dataset 1

An online EEG database is used for training and evaluation of the proposed approach. The EEG signals obtained from the online database[20] are divided into 5 sets consisting of seizure, healthy, and interseizure EEG signals. The sampling frequency is 173.61 Hz, and the spectral bandwidth of the acquisition system is 0.5 Hz to 85 Hz. Each segment has a duration of 23.6 seconds and contains 4097 samples. Sets A and B are surface EEG signals recorded from healthy persons in the eyes-closed and eyes-open conditions. Sets C and D are intracranial EEG signals, and set E consists of seizure EEG segments. Each set consists of 100 EEG segments. Automatic classification accuracy is presented for eight different cases: healthy-seizure classification (A-E, AB-E), seizure-interseizure classification (C-E, D-E, CD-E), and seizure-nonseizure classification (AC-E, ABC-E, ABCD-E).
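As a quick consistency check on the figures above, the segment duration follows from the sample count and the sampling frequency, and each class combination fixes how many segments fall on either side of the binary problem. The sketch below is illustrative only; the names `COMBINATIONS` and `class_sizes` are hypothetical and do not come from the original work.

```python
# Dataset 1 figures quoted in the text.
FS_HZ = 173.61          # sampling frequency
SAMPLES = 4097          # samples per segment
SEGMENTS_PER_SET = 100  # segments in each of the sets A to E

duration_s = SAMPLES / FS_HZ  # roughly 23.6 seconds per segment

# The eight class combinations: each group of sets is compared with seizure set E.
COMBINATIONS = ["A", "AB", "C", "D", "CD", "AC", "ABC", "ABCD"]


def class_sizes(non_seizure_sets):
    """Return (non-seizure segments, seizure segments) for one combination."""
    return len(non_seizure_sets) * SEGMENTS_PER_SET, SEGMENTS_PER_SET


for combo in COMBINATIONS:
    print(f"{combo}-E", class_sizes(combo))
```

The larger combinations are imbalanced (for example, 400 non-seizure segments against 100 seizure segments for ABCD-E), which is worth keeping in mind when reading accuracy figures.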

Dataset 2

A surface EEG seizure dataset[21] is used for testing the proposed approach. This online database was collected at the Children's Hospital, Boston. The subjects had drug-resistant seizures. EEG signals from a total of 23 subjects were collected after drug intake was discontinued in the patients, in order to verify the need for surgery. The EEG sampling frequency was 256 Hz with 16-bit resolution. For most of the subjects, there were 23 or more EEG files. The international 10-20 EEG system was followed for recording the EEG signals. In this work the entire dataset is used to evaluate the accuracy of the proposed model. Uniformity is maintained by selecting 18 channels common to all subjects.

Proposed approach

Initially, the EEG signal is bandpass filtered between 0.5 Hz and 45 Hz. Filtering in this range removes power line interference and higher-frequency noise from the signal. Saturation artifacts, if any, are removed from continuous data by subtracting each EEG segment from the previous one; if the result is zero, the segment is removed. TQWT is used to decompose the signal into subbands, and Shannon entropy, collision entropy, transfer entropy, conditional probability, and Hjorth parameters are extracted from each subband in the frequency and time domains. Features are extracted for the (Q, J) pairs (2, 10), (3, 10), (6, 30), (10, 30), and (20, 30). The redundancy factor r is fixed at 3 for all Q and J combinations. The best features and the corresponding Q factor for the present problem are selected by the Kruskal-Wallis test. Discriminant correlation analysis (DCA) is used for feature fusion. Hierarchical clustering divides the feature set into clusters, and the cluster number of each feature vector is used as a symbol in the sequence that trains the HMM. The Baum-Welch algorithm is used to train a two-state ergodic HMM in which each state represents one of the classes being separated (seizure, interseizure, or healthy EEG). In this work, fused time and frequency domain features are used to train the system. During the testing phase, the test symbol sequence is generated by finding the training feature vector nearest to each test feature vector and assigning the cluster number of that training vector to the test vector. When the proposed model is trained on surface EEG signals, only 90 seconds of EEG are used, and features are extracted from every 5 seconds of EEG. Training and testing of the model follow the same procedure as for the intracranial EEG signals. The training and testing approaches are described in detail in Fig. 1 and Fig. 2, respectively.
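The segmentation and saturation-removal step can be sketched as follows. This is one possible reading of the description, assuming non-overlapping windows and treating a segment as saturated when it is sample-for-sample identical to the segment before it (so that their difference is zero). The function names `split_segments` and `drop_repeated_segments` are hypothetical.

```python
def split_segments(signal, fs, seconds=5):
    """Cut a signal into non-overlapping windows of `seconds` seconds."""
    n = int(fs * seconds)
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def drop_repeated_segments(segments):
    """Drop any segment whose difference from the previous segment is all zeros."""
    kept = []
    previous = None
    for segment in segments:
        is_repeat = previous is not None and all(
            a - b == 0 for a, b in zip(previous, segment)
        )
        if not is_repeat:
            kept.append(segment)
        previous = segment
    return kept


# Toy example: 2 Hz "signal", 1-second windows, with a flat stretch in the middle.
toy = [1, 2, 5, 5, 5, 5, 3, 4]
windows = split_segments(toy, fs=2, seconds=1)
print(drop_repeated_segments(windows))
```

Under this reading, the first window of a flat stretch is kept and only its repeats are discarded; a stricter rule could also drop the first one.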

Techniques

Tunable Q wavelet transform

TQWT is based on the concept of the discrete wavelet transform with an adjustable quality factor (Q)[22–23]. TQWT offers fast decomposition and perfect reconstruction, which makes it suitable for many biomedical signal processing problems. The transform can be implemented efficiently with the fast Fourier transform (FFT), and a radix-2 FFT can be used to decrease the computation time of TQWT. A high Q factor is desirable for processing oscillatory signals, since a higher Q factor results in a more oscillatory wavelet. TQWT is built on a perfect-reconstruction oversampled filter bank with real-valued scaling factors. Signals are processed through both low-pass and high-pass filters, and the output of the low-pass filter is further decomposed up to a preselected decomposition level J. The sampling frequencies of the output signals from the low-pass and high-pass filters are α*Fs and β*Fs, respectively, where Fs is the sampling frequency of the input signal. Mathematical expressions are given below.
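To make the role of the scaling factors concrete, the sketch below computes α and β from Q and r using the relations commonly given for TQWT, β = 2/(Q+1) and α = 1 − β/r, together with the usual bound on the number of levels, ⌊log(βN/8)/log(1/α)⌋. These relations are taken from the general TQWT literature rather than from this paper, and the function names are hypothetical.

```python
import math


def tqwt_scaling(q, r):
    """Return (alpha, beta), the low-pass and high-pass scaling factors."""
    beta = 2.0 / (q + 1.0)
    alpha = 1.0 - beta / r
    return alpha, beta


def max_levels(q, r, n):
    """Largest decomposition level supported for a signal of n samples."""
    alpha, beta = tqwt_scaling(q, r)
    return math.floor(math.log(beta * n / 8.0) / math.log(1.0 / alpha))


# (Q, J) pairs used in this work, with r = 3 and 4097-sample segments.
for q, j in [(2, 10), (3, 10), (6, 30), (10, 30), (20, 30)]:
    alpha, beta = tqwt_scaling(q, 3)
    print(q, j, round(alpha, 4), round(beta, 4), max_levels(q, 3, 4097))
```

Under these assumptions, every J used in the paper lies below the maximum level that a 4097-sample segment supports.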

Fig. 1 Proposed HMM training approach. SE: Shannon entropy; CE: collision entropy; TE: transfer entropy.

Fig. 2 Proposed HMM testing approach.

Redundancy factor r is defined as:

Q and J are defined as:

Where Wc is the center frequency, BW is the frequency bandwidth, and N is the total length of the signal. The frequency responses of the low-pass and high-pass filters are defined as shown in equations 5 and 6. Here the Daubechies frequency response is defined as follows.
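For reference, the standard TQWT relations that match the symbols named above can be written as below. This is a sketch following the commonly cited TQWT formulation, not a transcription of the authors' own equations, so the exact form and any numbering are assumptions.

```latex
\begin{align*}
r &= \frac{\beta}{1-\alpha} \\[4pt]
Q &= \frac{W_c}{BW} = \frac{2-\beta}{\beta} \\[4pt]
J_{\max} &= \left\lfloor \frac{\log(\beta N/8)}{\log(1/\alpha)} \right\rfloor \\[4pt]
H_0(\omega) &=
\begin{cases}
1, & |\omega| \le (1-\beta)\pi \\
\theta\!\left(\dfrac{\omega + (\beta-1)\pi}{\alpha+\beta-1}\right), & (1-\beta)\pi < |\omega| < \alpha\pi \\
0, & \alpha\pi \le |\omega| \le \pi
\end{cases} \\[4pt]
H_1(\omega) &=
\begin{cases}
0, & |\omega| \le (1-\beta)\pi \\
\theta\!\left(\dfrac{\alpha\pi - \omega}{\alpha+\beta-1}\right), & (1-\beta)\pi < |\omega| < \alpha\pi \\
1, & \alpha\pi \le |\omega| \le \pi
\end{cases} \\[4pt]
\theta(\omega) &= \tfrac{1}{2}\,(1+\cos\omega)\,\sqrt{2-\cos\omega}, \qquad |\omega| \le \pi
\end{align*}
```

Here H0 and H1 are the low-pass and high-pass frequency responses, and θ is the Daubechies frequency response with two vanishing moments that shapes their transition bands.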

Entropy features

In this work, entropy features[24] are used to differentiate seizure, interseizure, and healthy EEG patterns. EEG signals are segmented into two parts. Shannon entropy and collision entropy are extracted from each EEG segment in the frequency and time domains. Transfer entropy is evaluated between the two parts on the basis of conditional probability. Entropy features are preferred because they represent the randomness in the signal. Spectral entropy indicates the frequency variation of the signal, and transfer entropy indicates the dynamics of the system.
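A minimal sketch of the two simpler entropy features is given below. It assumes that the probability of each signal element is estimated as its relative frequency among the (already quantized) samples, and it uses base-2 logarithms; both choices are assumptions, since the estimator is not spelled out here. Collision entropy is taken as the Rényi entropy of order 2. The function names are hypothetical.

```python
import math
from collections import Counter


def probabilities(values):
    """Relative frequency of each distinct value in the segment."""
    counts = Counter(values)
    total = len(values)
    return [count / total for count in counts.values()]


def shannon_entropy(values):
    """H = -sum(p * log2(p)) over the distinct values."""
    return -sum(p * math.log2(p) for p in probabilities(values))


def collision_entropy(values):
    """Renyi entropy of order 2: H2 = -log2(sum(p^2))."""
    return -math.log2(sum(p * p for p in probabilities(values)))


segment = [1, 1, 1, 2, 3, 3, 2, 1]
print(shannon_entropy(segment), collision_entropy(segment))
```

For a continuous-valued EEG segment, the samples would first need to be binned or rounded so that repeated values exist; how coarsely to quantize is a design choice that affects both features.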

Hjorth parameter

These features[25–26] describe the shape of the EEG signal. There are three parameters: activity, mobility, and complexity. Activity is the variance of the signal and indicates the spread of the data. Mobility and complexity are derived from the variances of the signal and its derivatives.

Mathematically, transfer entropy (TE), conditional probability (CP), Shannon entropy (SE), collision entropy (CE), and the Hjorth parameters for the Fourier transforms X(w), Y(w) of the EEG signal and for the EEG signal segments x(n), y(n) are presented in Table 1. P[X(w)] and P[x(n)] represent the probability of repetition of each signal element. Xc represents the complete EEG signal, and xc'(n) and Xc'(w) represent the first derivative of the EEG signal and the first derivative of the Fourier transform of the EEG signal, respectively. τ represents the time delay in the signal. Here the time lag corresponds to 50% overlap between signal segments.
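The Hjorth parameters can be sketched with their standard definitions: activity is the variance of the signal, mobility is the square root of the ratio of the variance of the first derivative to the variance of the signal, and complexity is the mobility of the first derivative divided by the mobility of the signal. The code below approximates derivatives with first differences, which is an assumption; the function names are hypothetical.

```python
import math
from statistics import pvariance


def first_difference(x):
    """Discrete approximation of the first derivative."""
    return [b - a for a, b in zip(x, x[1:])]


def hjorth_parameters(x):
    """Return (activity, mobility, complexity) for a 1-D signal."""
    dx = first_difference(x)
    ddx = first_difference(dx)
    activity = pvariance(x)
    mobility = math.sqrt(pvariance(dx) / activity)
    mobility_of_derivative = math.sqrt(pvariance(ddx) / pvariance(dx))
    complexity = mobility_of_derivative / mobility
    return activity, mobility, complexity


# A pure 5 Hz sine sampled at 256 Hz: complexity should be close to 1.
sine = [math.sin(2 * math.pi * 5 * n / 256) for n in range(512)]
print(hjorth_parameters(sine))
```

Scaling the signal amplitude changes activity but leaves mobility and complexity unchanged, which is why the latter two are read as shape descriptors.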

Feature selection and fusion

Significant entropy features are selected from the frequency and time domains and fused to form an efficient feature set. The Kruskal-Wallis test[27] is used to select efficient features at different subband levels. The P values of features extracted from each TQWT subband are compared, and the feature from the most significant subband is selected for further processing. A weight factor of 0 or 1 is assigned based on feature efficiency. DCA is applied only when the Shannon and collision entropy features are significant. The Kruskal-Wallis test is a nonparametric counterpart of one-way ANOVA and extends the Wilcoxon rank sum test to more than two groups. It tests the null hypothesis that the groups have the same median. The best features are fused using DCA[28] and combined linearly to form the final feature set.
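The subband selection step can be sketched as follows for the two-class case. The code computes the Kruskal-Wallis H statistic with average ranks and a tie correction, and converts it to a P value using the chi-square approximation with one degree of freedom, which holds only for two groups. The function names and the dictionary layout are hypothetical, and this is an illustration rather than the authors' implementation.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def kruskal_wallis_two_groups(a, b):
    """Return (H, p) for two samples; p uses the chi-square law with 1 dof."""
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    sum_a = sum(ranks[:len(a)])
    sum_b = sum(ranks[len(a):])
    h = 12.0 / (n * (n + 1)) * (sum_a ** 2 / len(a) + sum_b ** 2 / len(b)) - 3.0 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1.0 - ties / float(n ** 3 - n)
    if correction == 0.0:  # every value identical: no evidence of a difference
        return 0.0, 1.0
    h = max(h / correction, 0.0)
    return h, math.erfc(math.sqrt(h / 2.0))


def most_significant_subband(features_by_subband):
    """Map of subband -> (class 1 values, class 2 values); return the best subband."""
    p_values = {
        j: kruskal_wallis_two_groups(a, b)[1]
        for j, (a, b) in features_by_subband.items()
    }
    return min(p_values, key=p_values.get), p_values


toy = {1: ([1, 6, 3, 8], [2, 5, 4, 7]), 2: ([1, 2, 3, 4], [5, 6, 7, 8])}
print(most_significant_subband(toy))
```

In the toy example, subband 2 separates the two classes completely and therefore has the smaller P value.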

Hierarchical clustering and classification

Clustering

Clustering involves assigning data points to groups based on similarity. Similarity of points is measured by minimum distance, connectivity, or intensity. Clustering of data points is achieved by an iterative process. In this work, agglomerative hierarchical clustering[29–30] is used to divide the data into groups. Agglomerative clustering proceeds by finding the similarity between objects; here the Chebyshev distance is used as the similarity measure. Based on similarity, the dataset is organized into a binary hierarchical cluster tree, which is then cut according to the maximum number of clusters. In this work, the entire dataset is divided into three clusters. Each cluster number acts as the symbol for its feature vectors. The number of clusters is selected based on the total number of classes to be classified.
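The clustering and symbol assignment can be sketched as below. The linkage rule is not stated in the text, so single linkage (merging the two clusters with the smallest minimum Chebyshev distance) is assumed here; the second function mirrors the testing step in which a test vector takes the cluster number of its nearest training vector. The function names are hypothetical.

```python
def chebyshev(u, v):
    """Chebyshev distance: the largest coordinate-wise absolute difference."""
    return max(abs(a - b) for a, b in zip(u, v))


def agglomerative_labels(points, n_clusters):
    """Merge the closest clusters (single linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(chebyshev(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters, start=1):
        for i in members:
            labels[i] = label
    return labels


def nearest_symbol(test_vector, train_points, train_labels):
    """Give a test vector the cluster number of its nearest training vector."""
    nearest = min(range(len(train_points)),
                  key=lambda i: chebyshev(test_vector, train_points[i]))
    return train_labels[nearest]


train = [(0, 0), (0.1, 0.2), (5, 5), (5.2, 5.1), (10, 0), (10.1, 0.3)]
symbols = agglomerative_labels(train, n_clusters=3)
print(symbols, nearest_symbol((4.8, 5.3), train, symbols))
```

The resulting cluster numbers form the discrete symbol sequence that the HMM is trained and tested on.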

Classification

HMM[31–33] is a probabilistic classifier based on the Markov chain rule. The Baum-Welch algorithm is used for training and the Viterbi algorithm for testing the model[34]. In the proposed approach, a two-state ergodic HMM is used to model seizure, healthy, and interseizure EEG. Self-transition probabilities are set higher in the initial transition matrix. The emission matrix is calculated from the probability of each symbol in the feature set of each state. Twenty intracranial EEG signal segments are used for training, and 80 EEG signal segments from each set are used for testing the HMM algorithm. The Viterbi algorithm is used to find the most probable state of the system given the symbol sequence. The final state of the decoded test symbol sequence is taken as the class of the EEG signal. True positives (TP) and true negatives (TN) per class combination are calculated to analyze classifier efficiency. Mathematically, accuracy, specificity, and sensitivity are defined as follows.

Where FP = false positive and FN = false negative.
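These measures follow their usual definitions: accuracy is (TP + TN) / (TP + TN + FP + FN), sensitivity is TP / (TP + FN), and specificity is TN / (TN + FP). The short sketch below applies them to the AB-E case reported in the Results, assuming 80 test segments per set (so 80 seizure and 160 healthy segments) with one seizure segment misclassified as healthy. The function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity


# AB-E test set: 80 seizure and 160 healthy segments, one seizure segment missed.
acc, sens, spec = classification_metrics(tp=79, tn=160, fp=0, fn=1)
print(f"{100 * acc:.2f} {100 * sens:.2f} {100 * spec:.2f}")  # 99.58 98.75 100.00
```

The result matches the 99.58% accuracy, 98.75% sensitivity, and 100.00% specificity quoted for AB-E.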

Table 1 Entropy and Hjorth parameter estimation

The proposed approach is also verified on the surface EEG database. A total of 10 non-seizure and seizure EEG segments are used to train the HMM.
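The decoding step used at test time can be sketched with a small Viterbi implementation for a two-state model over three cluster symbols. The start, transition, and emission values below are illustrative placeholders (with higher self-transition probabilities, as described for the initial transition matrix), not trained parameters from this work, and symbols are indexed from 0 for convenience.

```python
import math


def safe_log(p):
    """Natural log that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(symbols, start, trans, emit):
    """Return the most probable state path for a sequence of symbol indices."""
    n_states = len(start)
    score = [safe_log(start[s]) + safe_log(emit[s][symbols[0]])
             for s in range(n_states)]
    back = []
    for sym in symbols[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states),
                       key=lambda r: prev[r] + safe_log(trans[r][s]))
            pointers.append(best)
            score.append(prev[best] + safe_log(trans[best][s])
                         + safe_log(emit[s][sym]))
        back.append(pointers)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Illustrative parameters: state 0 and state 1 stand for the two classes.
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.80, 0.15, 0.05], [0.05, 0.15, 0.80]]

path = viterbi([0, 0, 1, 0], start, trans, emit)
print(path, "-> class", path[-1])
```

Following the description above, the last state on the decoded path is taken as the class of the test sequence.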

Results

Classification

Table 2 and Table 3 show the final classification accuracy of the proposed automatic seizure detection approach. Healthy-seizure (A-E) EEG signals are perfectly detected with both time and frequency domain features. The combined eyes-open and eyes-closed class is separated from seizure segments (AB-E) with 99.58% accuracy. In class AB-E, one seizure segment is misclassified as healthy EEG in both the time and frequency domains, giving a sensitivity of 98.75% and a specificity of 100.00%. Clustering plays an important role in classification in the proposed approach: the clusters created by agglomerative hierarchical clustering are significantly different, resulting in good classification. Interseizure-seizure (C-E, D-E) signals are classified with 100% accuracy using time domain features. For all of the above classes, quality factor Q=2 resulted in good accuracy. A binary classifier is used to achieve good classification accuracy. The Kruskal-Wallis test P values indicate the efficiency of the combined features. As observed from Table 2 and Table 3, the fused feature for D-E is more efficient in the time domain than in the frequency domain. The classification accuracies achieved for AB-E and AC-E are comparable for time and frequency domain features. The seizure-interseizure (CD-E) class achieved good accuracy with time domain features. The fused feature for class ABC-E is more efficient in the frequency domain. The proposed approach classifies seizure-non-seizure with 97.00% accuracy for a higher Q (Q=10). Overall, the proposed approach achieved good accuracy for Q=2 and 3. In all classes, higher Q values reduced the performance of the proposed system; higher Q values are not suitable for intracranial EEG seizure classification. The CD-E HMM designed with frequency domain features has a low classification rate (Table 3). This can be explained by the fact that the clusters created for the groups overlap, which results in a less accurate classification model. The D-E classification model with frequency domain features has the lowest classification accuracy, with only 47 out of 80 seizure signals correctly classified.

In CD-E classification, only 26 out of 80 segments are detected as true positives. Fig. 3 shows the box plot from the nonparametric ANOVA (Kruskal-Wallis) test between D and E for time domain features after feature fusion. There is a clear difference in median value between the two classes. The classifier with the highest accuracy is further analyzed for individual feature accuracy.

Analysis of features for highest classification accuracy

Ten different features were extracted from the TQWT subbands, and the features with maximum significance were considered for classification. Features from both x(n) and X(w) were extracted and evaluated to achieve good classification accuracy; the subscripts t and f below denote time domain and frequency domain features, respectively. For healthy-seizure (A-E) classification, entropy features extracted from x(n) were found to be the most significant. Lower subbands have higher classification accuracy, as shown in Table 4. The maximum accuracy was observed for SE_t and CE_t. The Hjorth parameters activity and mobility are not significant. Transfer entropy is significant in seizure-healthy classification. Fig. 4 shows the agglomerative hierarchical clustering for the transfer entropy feature in the healthy-seizure case. The fused feature forms a distinct cluster for each class, indicating good separation by the transfer entropy feature with Q=2 and J=10, which is reflected in the 100% classification accuracy. A Q factor of 2 achieved the best classification accuracy for both time and frequency domain features. In class AB-E, a feature pattern similar to that of A-E is observed. In interseizure-seizure classification (C-E), Hjorth parameters such as mobility_t and complexity_t are found to be significant with Q=2. The TE_f, CE_f, and CP_f features are significant in differentiating interseizure and seizure EEG signals. Hjorth parameters were not significant in the frequency domain. In classifying D-E, SE_t and CE_t were found to be significant. The Hjorth parameters failed to classify D-E in both the time and frequency domains. For Q=2 and J=10, the HMM achieved good accuracy with features from x(n) extracted from subbands in the range 6–7. In the combined class CD-E, CP_t and TE_t were observed to achieve good classification accuracy. In AC-E classification, TE_t is significant compared with other features. CP_t and CP_f are observed to achieve good accuracy in ABC-E classification, as shown in Table 4. Maximum accuracy is observed at Q=2 and 3. In this class combination, the Hjorth parameters failed to classify efficiently for all Q values. Features such as CP_f and SE_t are efficient in classifying seizure and nonseizure (ABCD-E) EEG signals. In class ABCD-E, the mobility_f feature failed to recognize seizure completely; therefore, its accuracy is not reported. It is observed that Q factors 2 and 3 are the most significant in classifying seizure, healthy, and interseizure EEG signals. Features from the upper subbands of TQWT were not significant in detecting seizure.

Table 2 Classification accuracy of different class combinations for time domain feature

Table 3 Classification accuracy of different class combinations for frequency domain feature

Fig. 3 D-E class box plot for time domain features after Kruskal Wallis test.

Model evaluation on surface EEG data

The complete surface EEG database was used to evaluate the seizure detection algorithm. The EEG signal is decomposed using TQWT up to level 10 with the Q factor and r factor both equal to 3. Time domain features are extracted from one 5-second window at a time from continuous seizure and nonseizure EEG signals. Each EEG channel is processed separately, and features from all channels are combined to classify seizure EEG signals. The maximum of each entropy, Hjorth parameter, and conditional probability feature over every 1 minute of data is used for classification. Table 5 gives the classification accuracy of individual features at different J levels. Bold type indicates the highest accuracy for a feature at a given J level. As observed from Table 5, TE_t achieved its highest accuracy, 81.50%, for seizure detection at J level 5. CE_t achieved 90% accuracy in seizure detection. The lowest accuracy is observed for the Hjorth mobility and complexity features. The mobility feature achieved 66.06% accuracy at J level 7. The complexity feature failed completely in classifying healthy and seizure EEG signals. Among the Hjorth parameters, the activity feature achieved 78.32% accuracy at J level 9. The features with accuracy higher than 80% at a given J level, namely TE_t at J level 3, SE_t at J level 2, and CE_t at J level 3, are used together as the feature set to classify seizure and nonseizure EEG segments. The proposed system achieved 96.87% accuracy, indicating its efficiency in seizure detection from surface EEG signals.
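The per-minute pooling described above can be sketched as follows for a single feature on a single channel. The sketch assumes non-overlapping 5-second windows, so that each minute holds 12 window-level values, and it drops a trailing partial minute; both are assumptions. The function name is hypothetical.

```python
def max_pool_per_minute(window_values, window_s=5, pool_s=60):
    """Keep the maximum window-level feature value within each full minute."""
    per_pool = pool_s // window_s  # 12 five-second windows per minute
    return [
        max(window_values[i:i + per_pool])
        for i in range(0, len(window_values) - per_pool + 1, per_pool)
    ]


# Two minutes of window-level feature values (24 windows of 5 seconds each).
values = [0.1 * k for k in range(24)]
print(max_pool_per_minute(values))
```

Repeating this for each feature and each of the 18 channels would give one pooled value per feature, channel, and minute; how those values are then combined across channels is not specified in the text.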

Fig. 4 A-E class cluster in time domain.

Comparison with other state-of-the-art methods

Table 6 gives a comparison between state-of-the-art methods and the proposed approach for seizure detection using the Bonn database. Our proposed HMM based approach gives good accuracy compared with the SVM and bagging based state-of-the-art approaches. We have also compared our HMM based approach with that of our previous work[19] and observed higher accuracy. The accuracy increased due to the use of TQWT features and the feature fusion approach. The proposed approach was tested on the MATLAB platform on a Core i5 system. The system took 0.28 seconds overall to process and classify the data. With increased processing speed, the present system can be used for real-time seizure detection. Simple features are used, making the system simple and efficient.

Discussion

This research work deals with an HMM classifier based automatic seizure detection system. An HMM classifier deals with time-sequential activity, and in this work seizure and nonseizure events are treated as sequential activity. The designed HMM classifier is a two-state ergodic model with transition probabilities between states estimated using the Baum-Welch algorithm. The entropy features Shannon entropy, collision entropy, and transfer entropy are evaluated for effectiveness in seizure detection. Two different databases are used for training and testing the seizure model. The above entropy features are found to be useful in seizure detection for both the intracranial and surface EEG databases. Entropy features represent the regularity of the signal. EEG signals are more regular during a seizure event than during a nonseizure event. Apart from entropy features, conditional probabilities and Hjorth parameters are extracted from the EEG signal. The conditional probability feature represents the repetition of events, and the Hjorth parameters represent statistical information about the EEG signal.

All features are extracted from the EEG signal in both the time and frequency domains. A wrapper feature selection approach is used for efficient feature selection. Feature selection and the fusion of efficient features reduced the feature dimension, thereby reducing the computation time. The accuracy achieved by features from the time and frequency domains is comparable for different class combinations. The fused time-domain feature classified EEG class D-E with 100% accuracy and class CD-E with 97% accuracy.

Table 5 Classification accuracy of proposed model on surface EEG data for various J values

Table 6 Accuracy comparison of proposed approach with other state-of-the-art methods

The fused frequency-domain feature achieved lower accuracy in both the CD-E and D-E class combinations. The fused time-domain feature is preferred for seizure detection. The efficiency of the proposed model is further evaluated using surface EEG signals, and the model performed satisfactorily in seizure detection.

Conclusion and future scope

An efficient HMM based automatic seizure detection system is proposed. The algorithm is efficient in seizure-healthy classification, with 100.00% accuracy. The efficiency of different features is evaluated in both the time and frequency domains. Entropy and conditional probability features are efficient in classifying healthy-seizure EEG signals for different class combinations. Hjorth parameters are significant in detecting interseizure EEG signals. Lower Q values achieved good classification results. The proposed model is evaluated on a surface EEG database and achieved good results. Prediction of seizure is an active area of research. The proposed approach is computationally inexpensive and can be extended to real-time intracranial EEG seizure detection. In future, the proposed approach will be extended to the detection of seizure and non-seizure segments in a hospital dataset collected from Indian patients. Since probability based classifiers require less training, we will evaluate their classification performance in future.