Denoising of EEG Signals by Combining Wavelet Packet Transform with FastICA Algorithm

2014-11-16 陈宏铭, 王远大, 程玉华

Progress in Biomedical Engineering (生物医学工程学进展), 2014, Issue 3

0 INTRODUCTION

Electroencephalogram (EEG) signals carry a large amount of physiological and pathological information and play an important role in clinical research and disease diagnosis [1]. Interference unrelated to cerebral activity can corrupt the EEG signal and should be eliminated from the recording before any analysis [2]. The amplitude of EEG voltage signals in a healthy person is between 20 μV and 50 μV, which is weak and easily affected by all kinds of noise [3]. Recorded EEG signals are therefore very likely to contain a considerable amount of noise, which puts the doctor's analysis and diagnosis at risk.

The main types of noise in EEG signals are white noise with a Gaussian distribution, electrooculograms (EOG), noise from vascular beating, power-line interference, etc. This noise can be almost as large as the EEG signal itself, so the signal-to-noise ratio (SNR) is very low. The white Gaussian noise is generally taken to lie at much higher frequencies than the EEG signal [4]. Using the multi-resolution property of the wavelet transform [5], the noisy signal can be decomposed and the highest-frequency coefficients set to zero. In this way most of the white Gaussian noise is removed, but a portion of the relevant EEG signal is removed at the same time, which is not the desired result.

In this paper, the wavelet packet transform (WPT) is applied instead of the wavelet transform to remove the white Gaussian noise while keeping as much of the original EEG signal as possible [6]. WPT was first introduced by Coifman et al. [7] to offer a rich set of decomposition structures. It is a generalization of the dyadic wavelet transform (DWT) and is associated with a best-basis selection algorithm. The other, non-random noise, including EOG, vascular beating and power-line interference, has almost constant frequencies, so the wavelet transform can be used to remove it, too. However, because the EEG signal consists of both high-frequency and low-frequency components, part of the EEG data will inevitably be lost when the wavelet transform is used to remove non-random noise.

What is worse, removing noise with the wavelet transform requires the frequency range of the noise to be known in advance, so the wavelet transform cannot remove external noise effectively. When EEG signals are collected through a headset with sensors, some other unnoticed noise may be added. Compared with the wavelet transform, independent component analysis (ICA) is more suitable for removing non-random noise. The fundamental paper on ICA is due to Comon [8], and many ICA algorithms have since been developed by different research groups. The FastICA algorithm was proposed by researchers in Finland; see [9]. It is a linear ICA algorithm with fast convergence and good accuracy [10].

In this paper, the FastICA algorithm is used to remove all types of non-random noise. As its name suggests, it runs fast and with low complexity, which is an advantage when dealing with large amounts of data. At the same time, its accuracy is no worse than that of other forms of ICA. With this approach, different kinds of non-random noise are separated into individual channels. The frequency range of each type of noise does not need to be known, and when some unnoticed noise is added the system still works without any modification. The drawback of ICA, however, is that it cannot tell which output channel is a noise channel and which one is the EEG channel. To overcome this, a method of Quasi Expected Value (QEV) is proposed, by which the EEG signal can be selected from all FastICA outputs simply and effectively.

1 METHODOLOGIES

The flow chart of sampling and processing EEG signals is shown in Fig. 1. First of all, EEG signals are collected from different areas of the brain. To simplify the analysis, the number of channels is set to eight, with N sampling points in each channel. Thus we can construct a matrix S(8, N), with 8 rows and N columns. The WPT algorithm is then used to remove the random noise before ICA is applied, because the less random noise there is, the better the ICA algorithm works. Next, the FastICA algorithm processes all eight channels of WPT outputs. The FastICA outputs are free of random noise and are generated by different independent sources: one of them is the original EEG signal and the others are different types of noise. The final step is to pick out the original EEG signal from these channels with the QEV method.

Fig. 1 Flow chart of sampling and processing of EEG signals

1.1 WPT in Removing Random Noise

The Mallat fast decomposition algorithm is used to analyze the EEG signal. With wavelet packets, the EEG signal can be divided into arbitrary frequency bands, and all time-frequency components of the EEG signal can be mapped into orthogonal spaces that correspond to different frequency bands [11]. In the wavelet transform, the (EEG) signal is decomposed into two parts: a high-frequency (detail) component and a low-frequency (approximation) component. The low-frequency component is then decomposed again into high-frequency and low-frequency components, and the procedure is repeated. Because only the low-frequency branch is split further, the wavelet transform resolves the low-frequency component well but the high-frequency component poorly.

The wavelet packet transform provides a multi-scale analysis method for non-stationary signals. The frequency band is divided over multiple levels: at each level, both the high-frequency (detail) and the low-frequency (approximation) coefficients are decomposed further, which creates a full binary tree. This improves the time-frequency resolution and makes EEG signal analysis more reliable. Because part of the EEG signal carries important personal health information, the detail of the EEG signal needs to be kept as far as possible [5].

To preserve the detail, the WPT algorithm is adopted in this paper to remove random noise. A wavelet packet decomposition tree of the WPT algorithm [12] is shown in Fig. 2. Suppose the EEG signal lies in scale-space S; the sketch shows its decomposition over three scale levels. In this figure, "A" and "D" stand for the low-frequency and high-frequency components of the signal, respectively. The node DDD3 is set to zero to remove the random noise. Based on experiments, the db4 wavelet is chosen and the decomposition level is three. After Mallat decomposition the EEG signal is split into detail and approximation bands that together retain the complete information.

Fig. 2 Decomposition tree of wavelet packet
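The decomposition and zeroing step can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the implementation used in the paper: it uses the Haar wavelet instead of db4 to keep the code short, it assumes the signal length is a multiple of 2^level, and all function names are hypothetical.

```python
import numpy as np

def haar_split(x):
    """One Haar analysis step: approximation (A) and detail (D) halves."""
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return a, d

def haar_merge(a, d):
    """Inverse of haar_split."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def wp_decompose(x, level):
    """Full wavelet packet tree: every node is split at every level."""
    nodes = {"": np.asarray(x, dtype=float)}
    for _ in range(level):
        split = {}
        for path, data in nodes.items():
            split[path + "A"], split[path + "D"] = haar_split(data)
        nodes = split
    return nodes  # keys such as "AAA", "AAD", ..., "DDD"

def wp_reconstruct(nodes):
    """Merge sibling nodes upwards until the root is rebuilt."""
    nodes = dict(nodes)
    while "" not in nodes:
        parents = {path[:-1] for path in nodes}
        nodes = {p: haar_merge(nodes[p + "A"], nodes[p + "D"]) for p in parents}
    return nodes[""]

def denoise_channel(x, level=3, drop=("DDD",)):
    """Zero the listed terminal nodes and reconstruct the channel."""
    if len(x) % (2 ** level) != 0:
        raise ValueError("signal length must be a multiple of 2**level")
    nodes = wp_decompose(x, level)
    for path in drop:
        nodes[path] = np.zeros_like(nodes[path])
    return wp_reconstruct(nodes)
```

Applied row by row to the 8 x N matrix, `denoise_channel` would give the WPT outputs that are passed on to FastICA. With a wavelet library, the same structure would apply with db4 in place of the Haar filters.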

1.2 FastICA Algorithm in Removing Non-random Noise

The problem addressed by ICA can typically be described as follows: suppose that S = [S1; S2; ...; SN] is the original, unknown multivariate signal matrix and X = [X1; X2; ...; Xn] is the observed signal matrix, obtained through an unknown linear mixing matrix A such that X = A*S. The task is to find a matrix B such that Ŝ = B*X is an optimal estimate of S [13]. If this succeeds, the signals generated by different sources are separated into different channels.

The basic requirements of ICA are listed below:

1) The targeted signal must be totally or approximately independent of all the noise in the observed signal. In other words, the cross-correlation coefficients should be close to zero;

2) All the signals and noise, including the targeted signals, must be non-Gaussian in nature;

3) The number of channels must be no fewer than the number of targeted signals and noise types.

The main types of noise in the EEG signal are white Gaussian noise, EOG, noise from vascular beating and power-line interference. After the application of WPT, the white Gaussian noise is removed. All the remaining kinds of noise are non-Gaussian, which meets condition 2). Because the remaining noise and the targeted signal come from different sources, they are independent of each other, which meets condition 1). After the white Gaussian noise is removed, three types of noise remain, which together with the targeted signal give four sources. If the number of channels is greater than or equal to four, condition 3) is met.

The mixing and de-mixing processes are shown in Fig. 3, in which the de-mixing stage is the ICA algorithm. The de-mixing algorithm always takes two steps:

Step 1. Whitening: a transformation that makes the components z_i(t) mutually uncorrelated with variance equal to one.

Step 2. Orthogonal transformation: a transformation that makes all components of y(t) independent of each other while keeping the variance of y(t) unchanged.

Fig. 3 The processes of mixing and de-mixing
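The two de-mixing steps can be illustrated with a short sketch. It assumes whitening by eigen-decomposition of the sample covariance, which is one common choice and not necessarily the one used in the paper; the function name is hypothetical.

```python
import numpy as np

def whiten(x):
    """Center each row of x (channels x samples) and whiten it.

    Returns z, whose sample covariance is the identity matrix, and the
    whitening matrix v such that z = v @ (x - row means).
    """
    xc = x - x.mean(axis=1, keepdims=True)
    eigval, eigvec = np.linalg.eigh(np.cov(xc))
    # Symmetric whitening matrix: E * D^(-1/2) * E^T
    v = eigvec @ np.diag(1.0 / np.sqrt(eigval)) @ eigvec.T
    return v @ xc, v
```

Because the covariance of z is the identity, any orthogonal matrix Q leaves it unchanged: the covariance of Q @ z is Q @ Q.T, which is again the identity. Step 2 therefore only has to search among orthogonal transformations for the one that makes the components independent.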

The FastICA algorithm (also known as the fixed-point algorithm) is a high-speed algorithm for determining the optimal value [14]. The two major optimization criteria are maximum likelihood and maximum negentropy. In this paper, the maximum negentropy criterion is adopted. FastICA uses fixed-point iteration to achieve fast convergence.

Negentropy is defined as follows [15]:

Here y_g is the Gaussian random variable having the same covariance matrix as y. The smaller the absolute value of the negentropy is, the closer y is to a Gaussian variable; the negentropy equals zero only when y is Gaussian. According to (2), the probability density distribution is needed to calculate H(y). However, the probability density distribution is generally rather difficult to find, so equation (2) is replaced by the approximation [15]:
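In the standard formulation, which matches the description above, the negentropy and its approximation can be written as below. The symbols N(y) and N_f(y) are assumptions chosen to agree with the N_f notation used in the text, the assignment of the numbers (2) and (3) is inferred from the surrounding references, and the proportionality sign hides a positive constant.

```latex
% (2) negentropy: entropy gap between y and a Gaussian variable y_g
%     with the same covariance matrix
N(y) = H(y_g) - H(y), \qquad H(y) = -\int p(y)\,\log p(y)\,\mathrm{d}y

% (3) approximation that avoids the probability density p(y)
N_f(y) \propto \bigl\{ E[f(y)] - E[f(y_g)] \bigr\}^{2}
```

Since a Gaussian variable has the largest entropy among all variables with a given covariance, N(y) is non-negative and vanishes only when y is Gaussian.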

In (3), E denotes the expected value and f is a non-linear function. Several forms of f are in use. The form of f adopted in this paper, given in equation (4) [15], is a very common non-linear function.

Thus the main task is to optimize the matrix W so as to maximize N_f(W^T X), in line with the maximum negentropy criterion.

The iteration process can be simplified to the following steps [15]:

The iteration does not stop until it reaches the convergence condition in (7), which is determined by the required precision. The convergence condition in this paper is ‖WN‖ ≤ 10^-4.
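A fixed-point iteration of this kind can be sketched as follows. This is the textbook one-unit FastICA update with deflation, not necessarily the exact steps (5) to (7) of the paper. The following are assumptions: the contrast function is log cosh (so its derivative is tanh), the input is already centered and whitened, and convergence is declared when the weight vector changes by at most 10^-4 between iterations.

```python
import numpy as np

def fastica_deflation(z, n_components, tol=1e-4, max_iter=200, seed=0):
    """Estimate unmixing vectors one at a time from whitened data.

    z has shape (channels, samples) and is assumed centered and whitened.
    Returns w_all with orthonormal rows; the estimated sources are
    w_all @ z.
    """
    rng = np.random.default_rng(seed)
    n_channels = z.shape[0]
    w_all = np.zeros((n_components, n_channels))
    for i in range(n_components):
        w = rng.standard_normal(n_channels)
        w -= w_all[:i].T @ (w_all[:i] @ w)
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            u = w @ z                      # projection w^T z
            g = np.tanh(u)                 # derivative of log cosh
            # Fixed-point update: E[z g(u)] - E[g'(u)] w
            w_new = (z * g).mean(axis=1) - (1.0 - g ** 2).mean() * w
            # Deflation: stay orthogonal to the rows already found
            w_new -= w_all[:i].T @ (w_all[:i] @ w_new)
            w_new /= np.linalg.norm(w_new)
            # w and -w describe the same component, so ignore the sign
            delta = min(np.linalg.norm(w_new - w), np.linalg.norm(w_new + w))
            w = w_new
            if delta <= tol:
                break
        w_all[i] = w
    return w_all
```

For the eight-channel case, `z` would be the whitened WPT output and `n_components` would be 8, giving the output channels Z = w_all @ z.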

1.3 QEV method in Choosing EEG Channel among All Channels

Among the FastICA output signals Z(i,:), one is the targeted signal and the others are different types of noise generated by different sources. In the WPT output signals X(i,:), the targeted signal is present in all channels, whereas each type of noise appears mainly in the channels near its source; channels far from a noise source contain hardly any of that noise. The original EEG signal therefore has a relatively high degree of correlation with all channels X(i,:), while the correlation between each noise channel and the channels X(i,:) is much lower, except for the channel near the noise source. Consequently, the expected value of the correlation coefficients between the original EEG signal and X(i,:) is higher than that between a noise channel and X(i,:). By choosing the highest expected value, the EEG channel can be located among the FastICA output channels. This is the method of expected value (EV).

In the experiments, however, the difference between the highest and the second-highest expected value was not large enough, which makes a wrong choice more likely. In this paper, a method of quasi expected value (QEV) is therefore proposed. It rests on the following observation: the correlation coefficients between a noise channel and the channels of X(i,:) are all low except for the one with the channel near its source, so the expected correlation coefficient excluding the maximum value is much lower. On the other hand, if the minimum correlation coefficient between the original EEG signal and X(i,:) is removed, the expected value of the targeted signal becomes higher. In the QEV stage, the expected correlation coefficient between each FastICA output channel and X(i,:) is therefore calculated with the maximum and minimum values excluded.
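The EV and QEV selection rules described above can be sketched as follows. The sketch assumes that the correlation coefficient is the zero-lag Pearson coefficient and that the expected value is the arithmetic mean; the function names are hypothetical. Each row of the correlation matrix here corresponds to one FastICA output channel.

```python
import numpy as np

def abs_corr_matrix(z, x):
    """|correlation| between each row of z and each row of x.

    z and x have shape (channels, samples); entry (i, j) of the result
    relates FastICA output Z(i,:) to WPT output X(j,:).
    """
    zc = z - z.mean(axis=1, keepdims=True)
    xc = x - x.mean(axis=1, keepdims=True)
    num = zc @ xc.T
    den = np.outer(np.linalg.norm(zc, axis=1), np.linalg.norm(xc, axis=1))
    return np.abs(num / den)

def expected_value(c):
    """EV: plain mean of each row."""
    return c.mean(axis=1)

def quasi_expected_value(c):
    """QEV: mean of each row after dropping its largest and smallest entry.

    Needs at least three columns.
    """
    ordered = np.sort(c, axis=1)
    return ordered[:, 1:-1].mean(axis=1)

def select_eeg_channel(z, x):
    """Index of the output channel with the highest QEV, plus all QEVs."""
    q = quasi_expected_value(abs_corr_matrix(z, x))
    return int(np.argmax(q)), q
```

Dropping the maximum lowers the score of a noise channel that correlates strongly with only one nearby sensor, and dropping the minimum raises the score of the EEG channel, so the gap between the two widens.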

1.4 Criteria of Independent Signals

The cross-correlation coefficient is a measure of similarity between two signals in data processing: the larger the cross-correlation coefficient is, the higher the degree of similarity. Generally speaking, two signals from different sources are assumed to be statistically independent, which means that their degree of similarity is very small. If the cross-correlation coefficients between any two of the FastICA output channels are low, within a certain range, each FastICA output channel can be regarded as coming from a different source. Cross-correlation coefficients are therefore generally used as the criterion of signal independence. The cross-correlation coefficient is given by formula (8) [15],

where N is the sequence length and m = 0, 1, ..., N-1.

The cross-correlation coefficient is a normalized ratio whose sign gives the direction of the correlation and whose absolute value gives the degree of similarity. Opinions on how to grade the correlation coefficient (in absolute value) differ in statistics, but the degree of similarity is commonly divided into the four levels shown in Table 1.

Tab. 1 Different ways to define the degree of similarity
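One common form of a normalized cross-correlation coefficient at lag m is sketched below. The exact normalization in formula (8) may differ, so both the formula and the function name should be read as assumptions; at lag m = 0 this form coincides with the Pearson correlation coefficient.

```python
import numpy as np

def xcorr_coeff(a, b, m=0):
    """Normalized cross-correlation of two equal-length sequences at lag m.

    Both sequences are mean-centered; the sum of products a[n] * b[n + m]
    is divided by the square root of the product of the two energies.
    """
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    n = len(a)
    num = np.dot(a[: n - m], b[m:])
    den = np.sqrt(np.dot(a, a) * np.dot(b, b))
    return num / den
```

Evaluating `xcorr_coeff` for every pair of channels gives a table of the kind used in the experiments, and its absolute values can be graded against the levels in Table 1.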

2 EXPERIMENTAL RESULTS AND DISCUSSION

The experimental database, collected at the Children's Hospital Boston (CHB-MIT), consists of 23 sets of EEG signals, each over 9 minutes long. The sampling frequency of the data is 256 Hz. The files are in EDF format, which MATLAB cannot read directly. The steps listed below are used to convert the file format.

· Convert eeg.edf into eeg.txt with the software EDFbrowser. The text file is in ASCII format.

· Analyze the contents of eeg.txt: the file consists of one group of timing information and 23 groups of EEG information.

· Write a program that reads eeg.txt into the vectors (x, y1, y2, ..., y23) with the MATLAB function textread().
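The reading step can be sketched in Python with NumPy in place of MATLAB's textread(). The file layout is an assumption: one header line followed by comma-separated rows, with the time stamp in the first column and one EEG channel in each remaining column. The delimiter and header length are parameters because the actual export may differ, and the function names are hypothetical.

```python
import numpy as np

def load_eeg_txt(path, delimiter=",", skip_header=1):
    """Read an ASCII export into a time vector and a channel matrix.

    Returns t with shape (samples,) and y with shape (channels, samples).
    """
    table = np.genfromtxt(path, delimiter=delimiter, skip_header=skip_header)
    t = table[:, 0]        # timing information
    y = table[:, 1:].T     # one row per EEG channel
    return t, y

def select_block(y, n_channels=8, n_samples=6000, start=0):
    """Cut out the first n_channels rows and n_samples consecutive samples."""
    return y[:n_channels, start:start + n_samples]
```

With the default arguments, `select_block` returns an array with 8 rows and 6000 columns, provided the recording is at least that large.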

Six thousand sampling points are selected from the full recording to construct the vectors (y1, y2, y3, y4, y5, y6, y7, y8). An eight-dimensional data set X = [y1, y2, y3, y4, y5, y6, y7, y8] is formed to reduce computation time. The waveforms of the data set are shown in Fig. 4. The cross-correlation coefficients between any two of the observed signals are shown in Table 2. They range from 0.02 to 0.69, covering the micro, real and significant correlation levels, which means that the observed signals contain noise generated by various noise sources. Two observed signals with a high cross-correlation coefficient contain little noise or the same noise, whereas two observed signals with a low cross-correlation coefficient inevitably contain noise from different noise sources.

The WPT output signal waveforms are cleaner and smoother, as shown in Fig. 5 (compare Fig. 4), which means that the white Gaussian noise has been suppressed to some degree. At the same time, plenty of high-frequency components remain in the waveforms, which means that the useful signal details have been retained. The cross-correlation coefficients between any two of the WPT output signals are shown in Table 3; they range from 0.02 to 0.68, again covering the micro, real and significant correlation levels. Comparing Table 3 with Table 2 shows that the degree of similarity among the WPT output signals decreases only slightly. WPT can thus remove noise, but its performance alone is not good enough.

The waveforms of the FastICA output signals are shown in Fig. 6. The difference between Fig. 4 and Fig. 6 is harder to see by eye, but the statistics are informative. The cross-correlation coefficients between any two of the FastICA output signals are shown in Table 4. Their order of magnitude is 10^-15 or 10^-16, which can be regarded as zero. By the criterion of Section 1.4, this means that the FastICA output signals are independent. The approach adopted in this paper therefore performs extremely well in removing the noise in the EEG signal.

Finally, the QEV method is used to pick out the original EEG signal from the FastICA output signals. The waveform of the QEV output signal is shown in Fig. 7. The process is described in detail below:

· Work out all the cross-correlation coefficients between each FastICA output signal Z(i,:) and each WPT output signal X(i,:); their absolute values are shown in Table 5.

· Work out the quasi expected value of each column of Table 5; the results are shown in Table 6.

For comparison, the EV method is applied to the same data by taking the expected value of each column of Table 5; the results are shown in Table 7. In both Table 6 and Table 7 the largest value belongs to Z(8,:), which means that both the QEV and the EV method successfully identify the original EEG signal channel (the 8th channel). In Table 6, the first and second highest values are 0.483 and 0.334, a difference of 0.149, while in Table 7 they are 0.455 and 0.346, a difference of 0.109. Compared with the EV method, the tolerance of the QEV method is thus improved by 36.7% ((0.149 - 0.109)/0.109 ≈ 0.367), so the QEV method proposed in this paper separates the EEG channel from the noise channels more clearly than the EV method.

Fig. 4 The observed signals

Fig. 5 The output signals of WPT algorithm

Fig. 6 The output signals of FastICA algorithm

Fig. 7 The output signal of QEV method

Tab. 3 Correlation coefficients between any two WPT output signal channels

Tab. 4 Correlation coefficients between any two FastICA output signal channels

Tab. 5 Correlation coefficients between observed signals and FastICA output signals

Tab. 6 Quasi expected values of FastICA output signals

Tab. 7 Expected values of FastICA output signals

3 CONCLUSION

In this paper, a method combining WPT with the FastICA algorithm is proposed to remove all types of noise from EEG signals. In experiments on data from the CHB-MIT database, the cross-correlation coefficients among all the output signals are of the order of 10^-15 or 10^-16, which shows that the method removes almost all of the noise. Wavelet packet analysis is employed to decompose the EEG signal into layers. To pick out the original EEG signal from the FastICA outputs, the QEV method is proposed. Compared with the EV method, the tolerance of the QEV method is improved by 36.7%. It is a simple and practical method for denoising EEG signals.

4 ACKNOWLEDGEMENTS

This research was supported by the 863 National High Technology Research and Development Program of China (2013AA011202) and the National 02 Key Special Program (2009ZX02305-005), the 863 National High Technology Research and Development Program of China (2013AA014102) and the National No. 2 Special Key Project Program (No. 2012ZX02503005).

REFERENCES

[1] Zunairah Haji Murat, Mohd Nasir Taib, Sahrim Lias, et al. Establishing the fundamental of brainwave balancing index (BBI) using EEG [C]. The 2nd Int. Conf. on Computational Intelligence, Communication Systems and Networks (CICSyN2010), Liverpool, United Kingdom, 2010.

[2] Melia, Umberto, Francesc Claria, Montserrat Vallverdu, et al. Removal of peak and spike noise in EEG signals based on the analytic signal magnitude [C]. Annual International Conference of the IEEE Engineering in Medicine and Biology Society. San Diego, CA, Aug. 2012.

[3] J. Yoo, L. Yan, D. El-Damak, et al. An 8-channel scalable EEG acquisition SoC with fully integrated patient-specific seizure classification and recording processor [C]. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers. San Francisco, CA, Feb. 2012.

[4] M. Mollazadeh, K. Murari, G. Cauwenberghs, et al. Micropower CMOS-integrated low-noise amplification, filtering, and digitization of multimodal neuropotentials [J]. IEEE Trans Biomed Circuits Syst, 2009, 3(1): 1-10.

[5] Luo Zhizeng, Li Yafei, Meng Ming, et al. An EEG signal processing method based on second-generation wavelet transform and blind signal separation [J]. Space Medicine & Medical Engineering (航天医学与医学工程), 2010, 23(2): 137-140 (in Chinese).

[6] Dahshan EI, Sayed EI. Genetic algorithm and wavelet hybrid scheme for ECG signal denoising [J]. Telecommunication Systems, 2010, 46(3): 209-215.

[7] Coifman R, Meyer Y, Quake S, et al. Signal processing and compression with wavelet packets [M]. Numerical Algorithms Research Group, New Haven, CT: Yale University, 1990.

[8] Comon P. Independent component analysis, a new concept? [J]. Signal Process, 1994, 36(3): 287-314.

[9] Hyvärinen A, Karhunen J, Oja E. Independent Component Analysis [M]. Wiley, New York, 2001.

[10] Charayaphan C, Sattar F. Design of low-cost FPGA hardware for real-time ICA-based blind source separation algorithm [J]. EURASIP J Appl Signal Process, 2005, 18: 3076-3086.

[11] Yan S, Zhao H, Liu C, et al. Brain-computer interface design based on wavelet packet transform and SVM [C]. International Conference on Systems and Informatics (ICSAI2012), Shanghai, China, May 2012.

[12] Kharate GK, Patil VH. Color image compression based on wavelet packet best tree [J]. Int J Comput Sci Issues, 2010, 7(3): 31-35.

[13] Shen H, Kleinsteuber M, Hüper K. Local convergence analysis of FastICA and related algorithms [J]. IEEE Trans Neural Networks, 2008, 19(6): 1022-1032.

[14] Ye J, Huang T. New fast-ICA algorithms for blind source separation without prewhitening [J]. Communicat Comput Informat Sci, 2011, 225(2): 579-585.

[15] Yang Fusheng, Hong Bo. Principles and Applications of Independent Component Analysis [M]. Beijing: Tsinghua University Press, 2006 (in Chinese).