A method for constraining the end effect of EMD based on sequential similarity detection and adaptive filter

2021-04-23 Wei Dongdong, Tang Wencheng

Journal of Southeast University (English Edition), 2021, Issue 1

Wei Dongdong Tang Wencheng

(School of Mechanical Engineering, Southeast University, Nanjing 211189, China)

Abstract: To address the end effect that arises in empirical mode decomposition (EMD), a method for constraining the end effect is proposed based on sequential similarity detection and an adaptive filter. The method divides the signal into many wavelets and varies the initial wavelet length to select the best initial wavelet, i.e., the one with the minimum error and the maximum number of matching seed wavelets. Wavelet slopes are used for pre-matching and secondary matching to speed up matching. A folded self-adaptive threshold is then used to select multiple seed wavelets, and finally the end waveform is predicted and extended with the adaptive filter method. The proposed method is applied to a non-stationary nonlinear simulation signal and an experimental signal, and it is compared with the mirror extension and RBF extension methods. The orthogonality index and similarity index of the EMD results obtained after extension by the proposed method are better than those of the other methods. The results show that the proposed method constrains the end effect better and is valid, accurate and stable in solving the end effect problem.

Key words: empirical mode decomposition (EMD); end effect; sequential similarity detection; adaptive filter

Empirical mode decomposition (EMD) is a signal processing algorithm proposed by Huang et al.[1-2] in 1998. It has significant advantages for processing nonlinear and non-stationary signals and has been applied successfully in fields such as radio signals and medical EEG signals[3-4]. As the theory of the algorithm has not been strictly proved, some problems remain in the data decomposition process, one of which is the end effect. In the decomposition process, the upper and lower envelopes are obtained by interpolating the signal extrema with cubic spline curves, but the endpoints of the signal may not be extrema, which causes the envelopes to diverge at the ends of the signal. This divergence seriously affects the accuracy and effectiveness of the decomposition results, especially when EMD is used for time series analysis and fault diagnosis. Scholars have studied several methods to constrain the end effect. On the one hand, fitting the envelopes with other splines inhibits the end effect, but this approach is rarely used since its practical performance is worse than that of cubic splines[5]. On the other hand, signal extension methods such as mirror extension, waveform matching extension, polynomial fitting prediction and neural network prediction are quite effective for constraining the end effect[6]. Wang et al.[7] proposed a method combining mirror extension and SVM to constrain the end effect. Hao et al.[8] proposed a new method that adds a window function to SVM-extended signals to constrain the end effect. Furthermore, the extension algorithm based on waveform matching extends the end of the signal with a similar waveform, and it takes into account the trend both inside and at the end of the signal. It has advantages unmatched by other methods and has become one of the important ways to constrain the end effect of EMD.
The traditional waveform matching extension can constrain the end effect effectively, and follow-up studies have proposed improved methods that optimize the algorithm and its accuracy. Shao et al.[9] applied the waveform matching method based on the distance function to the EMD endpoint extension and achieved good results. Su et al.[10] proposed the gray mean prediction model to extend the original data and reduce the end effect. Xu et al.[11] symbolized the local extreme value sequence of the signal and extended it according to feature matching, which reduced the impact of the end effect. However, these algorithms depend heavily on the accuracy and efficiency of the initial data matching. If the original data contain some deviation, the extended waveform will not match the actual situation. Therefore, this paper proposes an end effect suppression method based on sequential similarity matching and an adaptive filter, and applies it to a simulation signal and an experimental signal. The results show that this method can effectively suppress the end effect.

1 EMD and the End Effect

1.1 EMD

EMD is a representative adaptive time-frequency analysis method. Its main principle is to separate the intrinsic mode functions (IMFs) from a complex signal. An IMF must satisfy the following two conditions: over the whole data set, the number of extrema and the number of zero crossings must either be equal or differ at most by one; and at any point, the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero[1]. The flow chart is shown in Fig.1.

Fig.1 EMD flow chart

The specific decomposition process is as follows:

1) All extrema of the initial signal $x(t)$ are found, and cubic splines are used to fit the maxima and construct the upper envelope. Similarly, the minima are used to construct the lower envelope, and the mean $m(t)$ of the two envelopes is calculated. Then, the candidate IMF $h_1(t)$ can be calculated by

$$h_1(t)=x(t)-m(t) \tag{1}$$

2) Check whether $h_1(t)$ satisfies the IMF conditions above. If not, treat $h_1(t)$ as the new signal and repeat step 1) $k$ times until the conditions are satisfied. The resulting IMF, which represents the highest-frequency component of the initial signal, is recorded as $c_1(t)$.

$$c_1(t)=h_{1k}(t)=h_{1(k-1)}(t)-m_{1k}(t) \tag{2}$$

3) $c_1(t)$ is separated from $x(t)$ and the remainder $r_1(t)$ is obtained.

$$r_1(t)=x(t)-c_1(t) \tag{3}$$

4) The steps above are repeated $n$ times until $r_n(t)$ meets the given termination condition. The decomposition is then complete, and $n$ IMFs and a remainder are obtained. The initial signal can be expressed as

$$x(t)=\sum_{i=1}^{n} c_i(t)+r_n(t) \tag{4}$$

where $r_n(t)$ is the remainder representing the average trend of the signal, and $c_i(t)$ are the IMF components ordered from high to low frequency.
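The sifting step described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names (`natural_cubic_spline`, `local_extrema`, `sift_once`) are hypothetical, the spline uses natural boundary conditions (the text specifies cubic splines but not the boundary condition), and only a single sifting pass is shown, without the IMF check or the termination condition.

```python
import bisect
import math


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    # Tridiagonal system for the second derivatives m, with m[0] = m[-1] = 0.
    a, b, c, d = [0.0] * n, [1.0] * n, [0.0] * n, [0.0] * n
    for i in range(1, n - 1):
        a[i], b[i], c[i] = h[i - 1], 2.0 * (h[i - 1] + h[i]), h[i]
        d[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    for i in range(1, n):  # forward elimination (Thomas algorithm)
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    m = [0.0] * n
    m[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        m[i] = (d[i] - c[i] * m[i + 1]) / b[i]

    def evaluate(x):
        # Points outside the knots reuse the first or last cubic piece,
        # i.e. the envelope is extrapolated there.
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 2)
        dx0, dx1 = x - xs[i], xs[i + 1] - x
        return ((m[i] * dx1 ** 3 + m[i + 1] * dx0 ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * dx1
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * dx0)

    return evaluate


def local_extrema(x):
    """Indices of interior local maxima and minima of a sampled signal."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] <= x[i + 1]]
    return maxima, minima


def sift_once(x):
    """One sifting pass: h1(t) = x(t) - m(t), with m the mean of the two envelopes."""
    maxima, minima = local_extrema(x)
    upper = natural_cubic_spline(maxima, [x[i] for i in maxima])
    lower = natural_cubic_spline(minima, [x[i] for i in minima])
    mean = [(upper(t) + lower(t)) / 2.0 for t in range(len(x))]
    return [xi - mi for xi, mi in zip(x, mean)]
```

For a pure sinusoid whose extrema fall exactly on sampling points, the two envelopes are constant and `sift_once` returns the input unchanged. For general signals, the samples before the first extremum and after the last one lie outside the spline knots, so the envelopes there are extrapolated rather than interpolated.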

1.2 End effect

When a signal is decomposed by EMD, the extrema must be interpolated with a cubic spline curve. Since the two endpoints of the signal cannot be assumed to be extrema, the envelopes swing widely near the endpoints. As the decomposition proceeds, the errors propagate inward from the endpoints and accumulate, which eventually leads to an inaccurate result[12]. This is the end effect of EMD.

Take the simulation signal as an example to illustrate the end effect. The simulation signal is

(5)

Fig.2 shows the curve obtained by fitting the extrema of the simulation signal directly with cubic splines. Since the endpoints of the signal cannot be determined to be extrema, large oscillations occur at both ends of the signal during fitting, which seriously distorts the decomposition results. Therefore, it is necessary to eliminate or reduce the end effect by extending the endpoints appropriately.

Fig.2 The end effect of EMD

2 Adaptive Similar Waveform Matching Extension

2.1 Traditional waveform matching extension method

The waveform matching extension method is one of the end effect suppression methods. Its key assumption is that the trend at the signal end also appears inside the signal, which holds especially for signals with strong regularity. Its process mainly includes two parts: 1) Find the waveform inside the signal that has the same trend as the end of the signal; 2) Translate the best matching waveform to the signal end and extend it. Fig.3 shows the distance-based waveform matching process, and the matching distance of the two wavelets is

(6)

where $s_1(i)$ and $s_j(i)$ are the values of the $i$-th sampling point of the initial and matched wavelets, respectively, and $N$ is the data length of the wavelet. The waveform that precedes the wavelet with the minimum matching distance to S1 and includes at least two extrema is taken and translated to the signal end to achieve the extension.
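The matching and translation steps can be illustrated with a short Python sketch. The exact form of Eq.(6) is not reproduced in this copy, so the sketch assumes a sum of squared sample differences; the function names `matching_distance` and `extend_left` and the parameters `n` (wavelet length) and `q` (number of extension samples) are hypothetical, and candidate wavelets are taken at every sample offset rather than aligned on extrema.

```python
import math


def matching_distance(s1, sj):
    """Assumed distance: sum of squared differences of two equal-length wavelets."""
    return sum((a - b) ** 2 for a, b in zip(s1, sj))


def extend_left(signal, n, q):
    """Extend the start of `signal` by q samples using distance-based matching.

    The first n samples form the initial wavelet S1. Every interior window of
    length n that does not overlap S1 is a candidate. The q samples in front of
    the best candidate are copied to the front of the signal.
    Returns (extended signal, start index of best match, its distance).
    """
    template = signal[:n]
    best_j, best_d = None, None
    for j in range(max(n, q), len(signal) - n + 1):
        d = matching_distance(template, signal[j:j + n])
        if best_d is None or d < best_d:
            best_j, best_d = j, d
    extension = signal[best_j - q:best_j]
    return extension + signal, best_j, best_d
```

For a strictly periodic signal sampled with an integer number of points per period, the best match lies a whole number of periods away from the start and the copied samples continue the waveform exactly.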

Fig.3 Distance-based waveform matching process

The traditional waveform matching extension can suppress the end effect, but there are certain shortcomings as follows:

1) Interception of meaningless wavelets

As shown in Fig.4, when the wavelets are intercepted at equal lengths and the length of the initial waveform S is not selected properly, meaningless wavelets appear. These wavelets, such as S1, which contains two minimum points, and S2, which contains two maximum points, cannot match the initial waveform. Meaningless wavelets waste matching calculation time, make the matching algorithm inefficient and can even deprive it of practical meaning.

Fig.4 Meaningless wavelet interception

2) Matching error caused by discrete data

Since a computer can only process discrete data, a matching error inevitably occurs when the sampling frequency is not a common multiple of all frequency components of the original signal, which affects the matching accuracy to some extent. Take the following simulation signal as an example:

$$x(t)=\cos(100\pi t)+\sin(70\pi t) \tag{7}$$

The sampling frequency is 512 Hz, and the number of samples is 300. The signal is shown in Fig.5 and the matching distances are listed in Tab.1. From the detailed window and the table, it can be seen that the S1 and S3 wavelets should be matched, but the S2 wavelet is finally selected because of the differences that discrete sampling introduces into the matching distance. This is a matching error. In addition, such differences may cause larger matching errors for random waveforms with strong noise.
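The origin of this error can be checked numerically. The sketch below uses the signal of Eq.(7) with the sampling frequency and sample count given in the text; the window length of 40 samples and the helper name `distance` are assumptions made only for illustration, and the squared-difference distance is likewise assumed. The two components are 50 Hz and 35 Hz, so the signal repeats every 0.2 s, which corresponds to 102.4 samples at 512 Hz. No integer shift reproduces the initial wavelet exactly.

```python
import math

FS = 512.0        # sampling frequency in Hz (from the text)
N_SAMPLES = 300   # number of samples (from the text)


def x(t):
    """Simulation signal of Eq.(7)."""
    return math.cos(100 * math.pi * t) + math.sin(70 * math.pi * t)


signal = [x(n / FS) for n in range(N_SAMPLES)]

# Common period of the 50 Hz and 35 Hz components, expressed in samples.
period_samples = FS / math.gcd(50, 35)


def distance(shift, length=40):
    """Assumed squared-difference distance between the first `length` samples
    and the window that starts `shift` samples later."""
    return sum((signal[i] - signal[i + shift]) ** 2 for i in range(length))


# Shifts on either side of the true (non-integer) period both leave a residual.
d_below = distance(102)
d_above = distance(103)
```

Because the true period falls between two sampling instants, both `d_below` and `d_above` are non-zero, and the ranking of candidate wavelets by distance can be decided by this sampling offset rather than by the waveform shape.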

Tab.1 Waveform matching distance

2.2 Waveform matching extension based on sequential similarity detection and adaptive filter

Sequential analysis comes from mathematical statistics. Its purpose is to use samples to make statistical inferences[13-14]. The method does not specify the number of samples in advance, but takes a small sample and then decides whether to continue sampling, so it can reduce the number of samples effectively.

The sequential similarity detection algorithm (SSDA) is widely used in signal matching and other fields due to its low computational complexity and high accuracy. During signal matching, the threshold is generally adjusted until the best matching wavelet is selected. As described above, the best matching wavelet does not necessarily meet the extension requirements, so this paper adjusts the threshold to select multiple wavelets that meet the conditions and uses them to extend the waveform. The selection and adjustment of the cut-off threshold plays a very important role in wavelet selection. The usual cut-off threshold uses the matching distance mentioned above and does not consider the amplitude of the matched wavelet, so it is difficult to reflect the matching accuracy intuitively, which hinders threshold selection. Therefore, in this paper, the matching distance is divided by the square of the extreme difference of the initial wavelet to standardize it. In addition, the variance of the difference between matching wavelets is calculated to prevent sudden changes in the waveform. Adjusting the threshold in the folding mode speeds up matching.
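A minimal Python sketch of the sequential test with a standardized distance is given here for illustration. It assumes that the unnormalized distance is a sum of squared sample differences and that the "extreme difference" is the maximum minus the minimum of the initial wavelet; the function names `ssda_distance` and `select_seeds` are hypothetical, and the variance check mentioned above is omitted.

```python
def ssda_distance(template, candidate, threshold):
    """Accumulate the standardized squared error sample by sample.

    The accumulation stops as soon as the running error exceeds `threshold`,
    which is the sequential (early-termination) idea of SSDA.
    Returns (accumulated error, number of samples examined).
    Assumes the template is not constant, so the scale is non-zero.
    """
    span = max(template) - min(template)  # extreme difference of the initial wavelet
    scale = span ** 2
    err = 0.0
    for count, (a, b) in enumerate(zip(template, candidate), start=1):
        err += (a - b) ** 2 / scale
        if err > threshold:
            return err, count
    return err, len(template)


def select_seeds(template, candidates, threshold):
    """Return (index, error) for every candidate that passes the full test."""
    seeds = []
    for idx, cand in enumerate(candidates):
        err, used = ssda_distance(template, cand, threshold)
        if used == len(template) and err <= threshold:
            seeds.append((idx, err))
    return seeds
```

Dividing by the squared extreme difference makes the threshold dimensionless, so the same value can be used for wavelets of different amplitude. Poor candidates are rejected after only a few samples, which is where the saving in computation comes from.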

The selected wavelets are sorted in chronological order, and the adaptive filter method[15-16] is then used to extend the endpoints. The basic prediction formula is

(8)

(9)

where $i=1,2,\dots,N$; $t=N,N+1,\dots,n$; $n$ is the number of sequence data; $w_i$ is the $i$-th weight before adjustment; $w'_i$ is the $i$-th weight after adjustment; $k$ is the learning constant; and $e_{t+1}$ is the prediction error of the $(t+1)$-th wavelet. $k$ is generally set to $1/N$ to indicate the rate of weight adjustment.
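The prediction and weight adjustment can be sketched in Python as follows. Eqs.(8) and (9) are not reproduced in this copy, so the sketch assumes the standard adaptive filtering forecast, in which the prediction is a weighted sum of the last $N$ values and each weight is adjusted by $2k$ times the prediction error times the corresponding past value. This form is consistent with the symbols defined above, but it is an assumption. The function names, the equal initial weights and the `passes` parameter are hypothetical.

```python
import math


def predict_next(history, weights):
    """Assumed Eq.(8): x_hat[t+1] = sum_i w_i * x[t-i+1], with history[-1] = x[t]."""
    n = len(weights)
    recent = history[-1:-n - 1:-1]  # x[t], x[t-1], ..., x[t-n+1]
    return sum(w * x for w, x in zip(weights, recent))


def train_weights(series, n_weights, passes=1, k=None):
    """Assumed Eq.(9): w_i' = w_i + 2 * k * e[t+1] * x[t-i+1].

    k defaults to 1/N as stated in the text. The initial weights 1/N are an
    assumption of this sketch.
    """
    k = 1.0 / n_weights if k is None else k
    weights = [1.0 / n_weights] * n_weights
    for _ in range(passes):
        for t in range(n_weights, len(series)):
            history = series[:t]
            error = series[t] - predict_next(history, weights)
            recent = history[-1:-n_weights - 1:-1]
            weights = [w + 2.0 * k * error * x for w, x in zip(weights, recent)]
    return weights


def extend(series, weights, q):
    """Predict q further values, feeding each prediction back as input."""
    out = list(series)
    for _ in range(q):
        out.append(predict_next(out, weights))
    return out[len(series):]
```

As a usage example, training two weights on a sampled sinusoid of amplitude 0.5 for many passes drives them towards the exact two-term linear predictor of a sinusoid, after which `extend` continues the waveform. With larger amplitudes, a smaller `k` may be needed for the adjustment to remain stable.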

Combining sequential similarity detection with adaptive filtering gives the new matching extension algorithm both the accuracy and the speed needed to complete the endpoint extension.

The specific steps are as follows:

1) The original wavelet at the endpoint is determined as $X_1$. The length of this wavelet is $K=s+v$, where $s$ is the length of the wavelet containing the first maximum and minimum, and initially $v=1$. The ordinates of the maximum and minimum of $X_1$ are denoted as $M$ and $m$, respectively, and the slope of the straight line through these two points is denoted as $k_1$.

2) The signal is divided into $p$ segments of length $K$, and the slopes constructed from their extrema are recorded as $k_i$, $i=1,2,\dots,p$. If $|1-k_i/k_1|\le 0.5$, the wavelet is recorded in the matching wavelet library $Y$.

3) The initial cut-off threshold is set to $T_1$.

5) If the number of seed wavelets $N$ selected in the above steps is less than 1, set $T_2=2T_1$ and repeat step 4) until several seed wavelets are selected.

6) Let $v=1,2,\dots,n$, where $n$ is taken as the length of the next wave containing two extrema after $X_1$, according to the actual situation. Repeat steps 1) to 5) and take the group of wavelets with the largest number $N$ as the final seed wavelet group.

7) If $N$ is less than the set number of weights, the wavelet with the smallest error in the wavelet group is translated for extension. Otherwise, take the $q$ sampling points in front of each wavelet in the seed wavelet group as the extension wave, and arrange the extended wavelet groups in chronological order as

(10)

where $y_{NK}$ is the ordinate value of the $K$-th sampling point of the $N$-th seed wavelet.

Each column of $Z$ is predicted with the adaptive filter method to obtain the extended waveform.
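The slope pre-matching of step 2) and the threshold doubling of step 5) can be sketched in Python as follows. The sketch rests on several assumptions: the function names are hypothetical, the signal is split into consecutive non-overlapping segments with the first segment playing the role of $X_1$, and the selection rule applied at each threshold (keep every wavelet whose standardized error does not exceed the threshold) is assumed, because step 4) is not reproduced in this copy.

```python
def extreme_slope(wavelet):
    """Slope of the line joining the maximum and minimum samples of a wavelet.

    Assumes the wavelet is not constant, so the two extrema are distinct.
    """
    i_max = max(range(len(wavelet)), key=wavelet.__getitem__)
    i_min = min(range(len(wavelet)), key=wavelet.__getitem__)
    return (wavelet[i_max] - wavelet[i_min]) / (i_max - i_min)


def prematch(signal, k_len, tol=0.5):
    """Indices of segments whose slope satisfies |1 - k_i / k_1| <= tol."""
    segments = [signal[j:j + k_len]
                for j in range(0, len(signal) - k_len + 1, k_len)]
    k1 = extreme_slope(segments[0])
    library = []
    for idx, seg in enumerate(segments[1:], start=1):
        if abs(1 - extreme_slope(seg) / k1) <= tol:
            library.append(idx)
    return library


def select_with_doubling(errors, t1, max_doublings=20):
    """Double the cut-off threshold until at least one seed wavelet is selected.

    `errors` holds the standardized matching errors of the pre-matched wavelets.
    Returns (indices of selected seeds, final threshold).
    """
    threshold = t1
    for _ in range(max_doublings):
        seeds = [i for i, e in enumerate(errors) if e <= threshold]
        if seeds:
            return seeds, threshold
        threshold *= 2
    return [], threshold
```

The slope test is cheap because it needs only two samples per segment, so segments with the wrong trend are discarded before any sample-by-sample distance is computed.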

The flow chart of the proposed method in this paper is shown in Fig.6.

Fig.6 Flow chart of the proposed method

3 Simulation Signal Analysis

In order to verify the effectiveness of the method, this paper uses EMD to analyze the nonlinear and non-stationary simulation signal. This signal simulates the superposition state of the variable frequency amplitude modulation signal and the stable periodic signal, and a discontinuous signal is added to make it a non-stationary signal. The signal expression is

(11)

where the sampling frequency is 3 kHz, and the number of samples is 1500. The signal components and the composite signal are shown in Fig.7.

Fig.8 shows each IMF and the remainder obtained by applying EMD directly to the original signal. The figure shows that if the end effect is not treated, serious deviation appears at the endpoints and even a spurious intrinsic component arises.

Fig.8 EMD results of original signal

The EMD results and the Hilbert spectrum of the signal extended by the proposed method are shown in Fig.9 and Fig.10. According to Fig.9, the discontinuous signal does not appear as a separate component in the decomposition result because of the mode mixing effect; it is mixed into the other IMF components. This paper focuses only on the end effect. There is no obvious deviation at the endpoints of each IMF component, and one pseudo component is eliminated at the same time. It can be seen from Fig.10 that the frequency at the endpoints is clearer after processing. This indicates that the method in this paper suppresses the EMD end effect well and is helpful for the subsequent fault diagnosis process.

Fig.9 EMD results after extension

Fig.10 The Hilbert spectrum of the EMD results after extension

In order to further illustrate the effectiveness and accuracy of this method, IMF similarity and orthogonality indices are used to evaluate the performance of various end effect methods[17-18].

1) The IMF similarity is described by the ratio of each effective IMF component of the original signal and extended signal. The similarity is defined as

(12)

2) The IMF orthogonality is defined as

(13)

The $O$ value is used to measure the orthogonality level of the IMFs. The smaller the $O$ value, the better the orthogonality of the IMFs. In general, an $O$ value that does not exceed 0.06 means that the end effect suppression of the method is effective[13].
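An orthogonality index of this kind can be computed with the Python sketch below. Eq.(13) is not reproduced in this copy, so the sketch assumes the widely used overall index of orthogonality, i.e. the sum of all cross products of different components divided by the energy of the signal. Taking the absolute value of the result and the function name `orthogonality_index` are further assumptions.

```python
import math


def orthogonality_index(imfs, signal):
    """Assumed overall index: |sum_t sum_{i != j} c_i(t) c_j(t)| / sum_t x(t)^2.

    `imfs` is a list of components (the remainder may be included as one of
    them), each of the same length as `signal`.
    """
    cross = 0.0
    for t in range(len(signal)):
        values = [imf[t] for imf in imfs]
        total = sum(values)
        # (sum c_i)^2 - sum c_i^2 equals the sum of c_i * c_j over all i != j.
        cross += total ** 2 - sum(v * v for v in values)
    energy = sum(x * x for x in signal)
    return abs(cross) / energy
```

Two sinusoids with different integer numbers of cycles in the record give an index close to zero, while two identical components give 0.5, which illustrates why smaller values indicate a cleaner decomposition.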

The waveforms shown in Fig.11 include the original signal, the mirror extension signal, the RBF extension signal, and the signal extended by the proposed method. The first 50 sampling points of the waveform shown in Fig.11 are part of the original signal. The mirror extension uses the method in Ref.[19], in which the extreme point closest to the endpoint is used as the mirror point to fold the signal and obtain the extended signal. The RBF extension uses the method in Ref.[19], and its parameters are set as follows: the number of input neurons is 150; the number of output neurons is 50; the target mean square error is 0.001; and the expansion speed is 3.22. It can be seen from Fig.11 that the mirror extension folds the signal directly at the mirror point, so the extended signal is consistent with the original signal near the end but does not match it at the far end. The RBF extension can bring the extension signal as close to the original signal as possible. However, the discontinuous signal makes neural network training and prediction more difficult, and the extended waveform is not ideal. The method in this paper can effectively capture similar waveforms and adaptively extend them to the endpoints, so its extension is more consistent with the original signal.
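For comparison, the mirror extension described above can be sketched in Python. This is a simplified reading of the description rather than the exact procedure of Ref.[19]: the function names are hypothetical, only the left end is treated, and the samples in front of the mirror point are assumed to be replaced by the mirrored ones.

```python
def first_extremum(x):
    """Index of the interior extremum closest to the start of the signal."""
    for i in range(1, len(x) - 1):
        if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0:
            return i
    raise ValueError("no interior extremum found")


def mirror_extend_left(x, q):
    """Fold the signal about its first extremum and prepend q mirrored samples."""
    e = first_extremum(x)
    mirrored = x[e + 1:e + 1 + q][::-1]
    return mirrored + x[e:]
```

For the samples `[0, 1, 3, 2, 1, 0, 1]` with `q = 3`, the mirror point is the maximum 3 and the result is `[0, 1, 2, 3, 2, 1, 0, 1]`, which is symmetric about that maximum. The extension therefore reproduces the signal only in the immediate neighbourhood of the mirror point, in line with the observation made for Fig.11.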

Fig.11 Results after the extension of various methods

The evaluation index values of the various methods are shown in Tab.2. It can be seen that the unextended signal has a redundant IMF component, and its decomposition deviation gradually increases. The orthogonality level of the mirror extended signal is better, but the third IMF begins to show a larger deviation. The neural network can constrain the end effect better after a short training, and the slight deviation is due to insufficient training time and training accuracy. Adjusting the training samples and training for a longer time can achieve higher accuracy. The proposed method achieves the best indices because it predicts the extended waveform directly from similar segments of the original signal, which gives better control of the end effect; it is effective and accurate.

Tab.2 Comparative analysis of evaluation indicators

4 Test Signal Analysis

In order to further verify the practicability and effectiveness of the method, this paper uses the experimental device shown in Fig.12 to measure the faulty bearing signal. The bearing damage is machined on the outer ring, inner ring and rotor by electrical discharge machining (EDM), and this paper randomly selects a section of the inner ring fault signal for analysis. The bearing model and experimental parameters are shown in Tab.3. The experimental signal is shown in Fig.13.

Fig.12 Experimental device

Tab.3 Bearing test parameters

Fig.13 Experimental signal

The results of the direct EMD and of the EMD after extension by the above three methods are shown in Fig.14 to Fig.17, respectively. This paper only shows the first five modal components. Fig.14 shows that the signal decomposed by direct EMD has a serious end effect from IMF1 to IMF5. The target mean square error of the RBF extension is set to 0.01 and the other parameters are the same as in the previous section. It can be seen that the extension methods suppress the end effect better, but IMF5 after mirror extension may still diverge at the end of the signal.

Fig.14 EMD results of unextended original signal

Fig.15 EMD results after mirror extension

Fig.16 EMD results after RBF extension

Fig.17 EMD results after the proposed extension

The two indices are also used to verify the effectiveness of the proposed method, and the comparative analysis results are listed in Tab.4. It can be seen that the similarity of the EMD results after mirror extension has a large deviation, the similarity of the EMD results after RBF extension is not stable enough, and the EMD results after the proposed method are more consistent with the original signal and have a lower orthogonality index. The research results show that the proposed method is suitable for deterministic, stationary random and non-stationary random signals, and that it has a good extension effect and is adaptive, so the proposed method can also provide some help for follow-up research.

Tab.4 Comparative analysis of evaluation indicators

5 Conclusions

1) The EMD end effect is analyzed based on the characteristics and shortcomings of the existing waveform matching and extension methods. An improved EMD end effect suppression method based on the sequential similarity and adaptive filter is proposed.

2) The simulation analysis and the experimental data analysis, evaluated with two indices, show that the proposed method suppresses the EMD end effect well and has strong adaptability, accuracy and computational efficiency.

3) Since the method requires initial parameters for the adaptive prediction, its versatility is limited to a certain extent, and it needs further improvement for completely irregular signals in the future.
