Short-time prediction for traffic flow based on wavelet de-noising and LSTM model

2021-10-21 WANG Qingrong, LI Tongwei, ZHU Changfeng

Journal of Measurement Science and Instrumentation, 2021, Issue 2

WANG Qingrong, LI Tongwei, ZHU Changfeng

(1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; 2. School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China)

Abstract: Some existing traffic flow prediction models target only a single road segment, and their input data are not pre-processed. To address these problems, a heuristic threshold algorithm is used to de-noise the original traffic flow data after wavelet decomposition. The correlation coefficients of the road traffic flow data are calculated, and a data compression matrix of the road network traffic flow is constructed. Data de-noising minimizes the interference of noisy data with the model, while the correlation analysis of road network data enables prediction at the road network level. To exploit the advantages of the long short term memory (LSTM) network in processing time series data, the compression matrix is input into the constructed LSTM model for short-term traffic flow prediction. The LSTM-1 and LSTM-2 models were trained on the original data and the de-noised data, respectively. In simulation experiments, different prediction horizons were set, and the results of the proposed model were compared with those of other methods. The accuracy of the proposed LSTM-2 model is 10.278% higher on average than that of the other prediction methods, and its prediction accuracy reaches 95.58%, which shows that the proposed short-term traffic flow prediction method is effective.

Key words: short-term traffic flow prediction; deep learning; wavelet denoising; network matrix compression; long short term memory (LSTM) network

0 Introduction

To better support real-time management of traffic conditions, and with the continuous development of new technologies, the intelligent transportation system (ITS) has been proposed; it is recognized as an effective way to alleviate various traffic problems and improve the efficiency of road traffic. The traffic guidance system and the traffic control system are two core subsystems of the ITS[1]. Real-time and accurate short-term traffic flow prediction is the premise and basis of traffic guidance, traffic control and travel route planning. Therefore, the massive traffic flow data available under road network conditions should be used effectively to obtain the attributes and characteristics of traffic flow. Using intelligent methods to reasonably predict future changes in road traffic flow can provide timely and accurate flow information for travelers. Optimizing vehicle travel paths, easing traffic flow and avoiding congestion are the most critical and urgent problems for ITS. As one of the key technologies of ITS, short-term traffic flow prediction has become a hot research topic.

A review of the relevant literature shows that experts and scholars have conducted fruitful research on the following short-term traffic flow prediction models and methods.

1) Statistical theory prediction model. Statistical theory models are mainly divided into linear and nonlinear theoretical models. Linear theoretical models include historical average models, time series models and Kalman filter models; nonlinear theoretical models include nonparametric regression models and chaotic theoretical models. In recent years, many domestic and foreign scholars have studied short-term traffic flow prediction, and the Kalman filter model is the most widely used of the linear theoretical models. A traffic flow prediction model based on Kalman filter theory was proposed in Ref.[2]. The Kalman filter model was studied in depth in Ref.[3], which used a stochastic adaptive Kalman filter model for traffic flow prediction and achieved good results. The autoregressive integrated moving average (ARIMA) model was used in Ref.[4] to predict highway traffic flow, which was the first use of a time series model in the field of traffic flow prediction. Ref.[5] showed that traffic data have chaotic characteristics and can be predicted by phase space reconstruction.

2) Combined forecasting model. This type of model combines several kinds of prediction models. Ref.[6] used the Kalman filter model and the artificial neural network model to estimate the short-term traffic flow linearly and nonlinearly, respectively. A fuzzy integrated model was then used to integrate the outputs of the two models to obtain the final prediction. The results show that the combination of the three models is more accurate. A new Bayesian combination forecasting method was proposed in Ref.[7], which analyzed the correlation between historical flows by entropy-based gray correlation and combined three different prediction models for short-term prediction. The traffic flow prediction results verify that the combined prediction model performs significantly better than any single model.

3) Deep learning based prediction model. The deep belief network (DBN)[8], the long short term memory network[9] and the stacked autoencoder (SAE) model[10] can handle large-scale multi-dimensional data and offer high model flexibility, excellent learning ability, good generalization ability and strong predictive power[11-13], which makes them better than traditional methods. Scholars have produced many relevant research results. A convolutional neural network (CNN) model with error feedback was proposed in Ref.[14] to predict traffic flow, and its prediction accuracy is greatly improved compared with the traditional model. Ref.[8] took the DBN as the bottom stack structure and made full use of weight sharing in the deep structure to propose a grouping method based on the weights of the top layer, which achieved a good prediction effect. However, the temporal information of the traffic flow data was not sufficiently mined in that work. An autoencoder was used in Ref.[15] to build a deep architecture model for predicting traffic flow characteristics, but this method did not consider the impact of traffic data processing on the prediction results. A CNN was used in Ref.[16] to extract traffic flow features, and the feature components were input into a support vector regression (SVR) model for prediction. Although the accuracy was greatly improved, the complexity of the traffic network was not fully considered. Ref.[17] used the KNN algorithm to select the data of the $K$ detection sites with the minimum error relative to the target site, and used these data as the input of a long short term memory (LSTM) model for prediction. An urban expressway prediction model based on LSTM-RNN was trained in Ref.[18] to identify and enhance the spatio-temporal correlation characteristics, taking into account both accuracy and timeliness. A deep bidirectional LSTM (DBL) model was used in Ref.[19] to capture the deep features of traffic flows. Online support vector regression (OL-SVR) was used in Ref.[20] to predict traffic flow under typical and atypical traffic conditions. An artificial neural network (ANN) with a flexible model structure was used in Ref.[21] to process multidimensional data for short-term traffic flow prediction, and it achieved a good prediction effect. A neural network with a genetic approach was applied to short-term traffic flow prediction in Ref.[22].

This paper first considers the abstract road network structure from the perspective of the macro road network, transforms the actual road network structure into a network topology, and selects road segments to analyze their spatio-temporal correlation, so that the correlation between road segments in the road network is taken into account in short-term traffic flow prediction. Then, in view of the large volume and high complexity of traffic flow data, the data are de-noised to eliminate interference. A road network matrix compression algorithm is proposed, in which the optimal compression threshold is determined by calculating the defined compression ratio; this improves the efficiency of the data preprocessing stage. Finally, an LSTM model, which performs well in the short-term traffic flow prediction field, is constructed and used for prediction. The spatio-temporal analysis of the road network, data preprocessing and the deep learning method are combined to minimize the interference of various factors with the prediction accuracy. Comparative experiments with different parameter settings show that the proposed method has a higher prediction efficiency.

1 Methodology

1.1 Wavelet threshold denoising principle

A noisy one-dimensional signal model can be expressed as

$f_t = y_t + e_t, \quad t = 1, 2, \ldots, M,$

(1)

where $t$ is an equal time interval; $f_t$ is the noisy signal; $y_t$ is the original signal; $e_t$ is Gaussian white noise with variance $\sigma^2$ that follows the $N(0, \sigma^2)$ distribution; and $M$ is the signal length. The original signal appears as a low frequency signal or a relatively stationary signal, and the noise appears as a high frequency signal. The purpose of noise cancellation is to suppress the noise portion $e_t$ in the signal $f_t$, thereby recovering the original signal $y_t$ from $f_t$.

1.2 Heuristic threshold denoising algorithm

Step 1:Wavelet decomposition.

An appropriate wavelet basis is chosen for the $j$-layer decomposition, and the Mallat pyramid algorithm is given by

(2)

where $x_{j,k}$ and $d_{j,k}$ are the discrete approximation coefficient and the discrete detail coefficient of the signal at decomposition layer $j$, respectively; $h_0(m)$ and $h_1(m)$ are the low-pass and high-pass filter coefficients, respectively, and

$h_1(m) = (-1)^{-m} h_0(N-m),$

(3)

where $N$ is the filter length.
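
With the symbols defined above, one decomposition step of the Mallat algorithm is usually written in the textbook form below. The index convention (filtering followed by downsampling by two) is an assumption here; other sources shift or reverse the filter index.

```latex
\begin{aligned}
x_{j,k} &= \sum_{m} h_0(m-2k)\, x_{j-1,m},\\
d_{j,k} &= \sum_{m} h_1(m-2k)\, x_{j-1,m}.
\end{aligned}
```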

Step 2: Select a soft threshold for the wavelet coefficients from the $l$th to the $N$th layers, and perform threshold quantization on the wavelet coefficients of each level by using a soft limiting function.

The soft limiting function is given by

(4)

$y_1(k) = \left(v_{\mathrm{sort}}(|y(k)|)\right)^2, \quad k = 0, 1, \ldots, n,$

(5)

where $v_{\mathrm{sort}}$ represents a sorting function that arranges the values in ascending order.

The formula for $\lambda$ is given by

(6)

(7)

(8)

Heuristic threshold estimation is a combination of the fixed threshold and unbiased likelihood estimation. Let the length of the signal $x(k)$ be $n$, the threshold obtained by unbiased likelihood estimation be $\lambda_1$, and the threshold obtained by the fixed threshold method be $\lambda_2$. Then we have

(9)

(10)

The estimation formula for the heuristic threshold is

(11)

Step 3:Reconstruction of one-dimensional wavelets.

The one-dimensional signal is reconstructed from the coefficients of the $N$th layer of the wavelet decomposition and the thresholded high frequency coefficients from the first layer to the $i$th layer.

The algorithm for signal reconstruction is

(12)

Denoising the traffic flow can be achieved by wavelet decomposition and reconstruction.
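
The three steps above (decomposition, soft thresholding, reconstruction) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses the Haar wavelet as the basis, swaps in the fixed (universal) threshold $\sigma\sqrt{2\ln n}$ in place of the heuristic threshold, estimates $\sigma$ from the median absolute value of the finest detail coefficients, and requires the signal length to be divisible by $2^{\text{levels}}$. All function names are hypothetical.

```python
import math
import random
import statistics


def haar_decompose(signal, levels):
    """Mallat-style decomposition with Haar filters (an assumed wavelet basis)."""
    approx = list(signal)
    details = []
    s = math.sqrt(2.0)
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / s for a, b in pairs])  # high-pass branch
        approx = [(a + b) / s for a, b in pairs]          # low-pass branch
    return approx, details


def haar_reconstruct(approx, details):
    """Inverse of haar_decompose: rebuild the signal from coarse to fine."""
    s = math.sqrt(2.0)
    out = list(approx)
    for d in reversed(details):
        nxt = []
        for a, c in zip(out, d):
            nxt.extend([(a + c) / s, (a - c) / s])
        out = nxt
    return out


def soft_threshold(coeffs, lam):
    """Shrink each coefficient toward zero by lam; zero those with |c| <= lam."""
    return [math.copysign(max(abs(c) - lam, 0.0), c) for c in coeffs]


def denoise(signal, levels=3):
    """Decompose, threshold the detail coefficients, and reconstruct."""
    approx, details = haar_decompose(signal, levels)
    # Noise level estimated from the finest detail coefficients (assumption).
    sigma = statistics.median(abs(c) for c in details[0]) / 0.6745
    # Fixed (universal) threshold, used here instead of the heuristic rule.
    lam = sigma * math.sqrt(2.0 * math.log(len(signal)))
    details = [soft_threshold(d, lam) for d in details]
    return haar_reconstruct(approx, details)
```

Calling `denoise(flow_series)` on a list of flow readings returns a smoothed list of the same length. Only the detail coefficients are thresholded; the approximation coefficients of the coarsest layer pass through unchanged.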

1.3 RNN neural network model

The traditional back propagation (BP) neural network cannot process the dependencies between earlier and later data, so it has an inherent defect in processing time series information. The recurrent neural network (RNN), which has feedback in the hidden layer, was therefore proposed to give the network a memory function. The structure of a typical RNN is shown in Fig.1.

Fig.1 RNN neural network structure

However, the RNN has a short memory span. When the distance between the position of the relevant information and the position of the predicted information becomes large, training the recurrent neural network becomes difficult, and the network loses the ability to connect earlier information to the current output. Therefore, when faced with long sequences, the gradient vanishes or explodes as the amount of learning or the learning period increases.

1.4 LSTM network model

The hidden layer of this model contains memory blocks, which can store and pass information over a long time. Each memory block is composed of a memory cell and three compound units: the input gate, the output gate and the forget gate. The "gate" structure contains a Sigmoid neural network layer and a pointwise multiplication operation. The output of the Sigmoid layer lies between "0" and "1", where "0" means that no information is allowed to pass and "1" means that all information is allowed to pass, thereby controlling the gate. The input gate determines how the input layer information is passed to the memory block of the hidden layer. The forget gate determines how much of the history information of the memory block is retained at the current time. The output gate determines how the memory block information is transmitted. The internal structure is shown in Fig.2.

Fig.2 LSTM unit structure diagram

Fig.3 LSTM improved RNN structure

The LSTM neural network was proposed in Ref.[23]. As shown in Fig.3, the hidden layer of the RNN is improved so that the LSTM can learn long-term dependency information and effectively avoid the vanishing gradient problem.

2 Construction of prediction model based on LSTM

2.1 Model construction

The LSTM model is constructed according to the structure in Fig.2, and the input gate can be expressed as

(13)

(14)

(15)

(16)

(17)

(18)

(19)

(20)

(21)

(22)

(23)
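
As an illustration of the gate mechanism described in Section 1.4, the following sketch computes one forward step of a standard LSTM cell. It follows the common textbook formulation (sigmoid gates, tanh candidate state, no peephole connections), which is an assumption and may differ in detail from the expressions used in this paper; the function and parameter names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(w, v):
    return sum(wi * vi for wi, vi in zip(w, v))


def lstm_step(x, h_prev, c_prev, params):
    """One forward step of a standard LSTM cell (textbook form, assumed).

    params maps each gate name ("i", "f", "o", "g") to (W, U, b): W has one row
    per hidden unit acting on x, U has one row per hidden unit acting on h_prev.
    """
    def gate(name, act):
        W, U, b = params[name]
        return [act(dot(W[k], x) + dot(U[k], h_prev) + b[k]) for k in range(len(b))]

    i = gate("i", sigmoid)      # input gate: how much new information enters
    f = gate("f", sigmoid)      # forget gate: how much old cell state is kept
    o = gate("o", sigmoid)      # output gate: how much cell state is exposed
    g = gate("g", math.tanh)    # candidate cell state
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c
```

Because the cell state `c` is updated additively (`f * c_prev + i * g`), information can be carried across many time steps, which is the property that mitigates the vanishing gradient problem of the plain RNN.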

The model in this paper has five layers, including two input layers, two output layers and one intermediate layer. The layers are stacked in order, and adjacent layers are fully connected. The first two hidden layers are LSTM layers. After feature selection, the pre-processed traffic flow feature compression matrix of the training set is sent directly from the input layer to the LSTM layers for recurrent calculation, and the input and output tensor dimensions of each LSTM layer are set separately. The activation function is set to tanh, and dropout constraints are added to the hidden layers of the model, so that the input connections of each LSTM module are temporarily deactivated with a certain probability during forward activation and backpropagation weight updating. The model is optimized with the adam algorithm, and the batch size is specified. Subsequently, all the extracted features are flattened into a single vector by the flatten layer, and this vector is used as the input of the last two layers, which are fully connected. A fully connected dense layer is used as the output layer, and its activation function is set to relu. The output dimension of the fully connected layer is 1, and the output of the final model is the predicted target $Y(t)$, which is the traffic flow at the next moment.

2.2 Model training

This paper builds the LSTM network model and trains it with the back propagation through time (BPTT) algorithm. The training and prediction flow chart is shown in Fig.4, where iterator and Epoch represent the current iteration count and the total number of training rounds, respectively.

Fig.4 LSTM prediction flow chart

The above model is trained with two data sets. The model trained with the original traffic flow data is recorded as LSTM-1, and the model trained with the wavelet de-noised data is recorded as LSTM-2. The two trained models are saved for the subsequent comparison experiments.

3 Road network traffic flow data compression method

3.1 Correlation coefficient

The correlation coefficient between two variables $x$ and $y$ is denoted as $R$; then

(24)

3.2 Matrix conversion of traffic flow data

The road network in the area is regarded as a network graph $G = (Q, E)$, where $Q$ is the set of nodes in the road network and $E$ is the set of all road segments in the entire network. Suppose there are $p$ road segments in the road network and $N$ is the time lag of the historical traffic flow data; then $E = \{S_i, i = 1, 2, \ldots, p\}$. Each road segment $S_i$ contains a continuous time series, recorded as $q_i = \{z(s_i, t-N+1), z(s_i, t-N+2), \ldots, z(s_i, t)\}$, where $q_i$ represents the traffic flow of road segment $S_i$ in the time period $(t-N+1, t)$. Thus $z(s_i, t_j)$ $(i = 1, \ldots, p;\ t_j = t-N+1, t-N+2, \ldots, t)$ represents the traffic flow of road segment $S_i$ within the time interval $(t_j - t_0, t_j)$. The traffic flow data of the entire road network constitute a space-time two-dimensional matrix, recorded as

(25)

According to Eq.(24), the correlation coefficient $R(i,g)$ of any two sections $i$ and $g$ of the road network can be obtained as

(26)
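
The correlation calculation for the space-time matrix can be sketched as follows. The sketch assumes that the correlation coefficient is the Pearson coefficient and that each row of the matrix holds the flow series of one road segment; the function names are hypothetical, and a constant series (zero variance) would cause a division by zero and is not handled.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(flows):
    """flows[i] is the flow series of segment i (one row of the space-time matrix)."""
    p = len(flows)
    return [[pearson(flows[i], flows[g]) for g in range(p)] for i in range(p)]
```

The result is a symmetric $p \times p$ matrix with ones on the diagonal, so in practice only the upper triangle needs to be computed.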

3.3 Road network data compression matrix construction process

Because the amount of experimental traffic flow data is large, the correlation coefficient $R$ is used in this paper to construct the compression matrix $U$ in order to improve the overall prediction efficiency. The specific flow chart is shown in Fig.5.

Fig.5 Flow chart of compression matrix construction
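
One possible way to build a compression matrix from the correlation matrix is sketched below. The greedy grouping rule is an assumption made for illustration and is not necessarily the procedure in Fig.5: a segment joins the first group whose seed segment it correlates with at or above the threshold, otherwise it starts a new group. One segment per group is then kept as the representative. The function names and the `alpha` parameter name are hypothetical.

```python
import random


def group_segments(R, alpha):
    """Greedy grouping (assumed rule): a segment joins the first group whose
    seed segment it correlates with at level >= alpha, else it seeds a new group."""
    groups = []
    for i in range(len(R)):
        for grp in groups:
            if R[grp[0]][i] >= alpha:
                grp.append(i)
                break
        else:
            groups.append([i])
    return groups


def compression_matrix(flows, groups, rng=random):
    """Keep one randomly chosen segment per group as its representative."""
    chosen = [rng.choice(grp) for grp in groups]
    return chosen, [flows[i] for i in chosen]
```

A higher threshold makes it harder for a segment to join an existing group, so the number of groups, and with it the number of rows in the compression matrix, tends to grow with the threshold.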

4 Experimental results and analysis

4.1 Traffic flow data preprocessing

To evaluate the effectiveness of the proposed method, this paper uses traffic data from the University of Minnesota Duluth for empirical analysis. The traffic flow, occupancy and speed data are collected in real time by more than 4 500 loop detectors around the Twin Cities Metro highways at intervals of 30 min on multiple sections of the road network. Servers at the University of Minnesota Duluth package the collected data into a single zip file and save it to an archive. In this paper, a real road network area is selected and the data in this archive are used for experimental verification. The selected road network structure is shown in Fig.6.

Fig.6 Road network structure

To show the experimental sections more clearly, the six roads 35E, 35W, 94E, 169N, 694N and 494E are extracted from the actual road network diagram (Fig.6) to form the study area, as shown in Fig.7.

Fig.7 Road topology map selected in road network area

The six roads in Fig.7 are arranged sequentially from left to right, and each road is divided into ten road sections. The resulting sixty road sections are numbered, and the division results are shown in Table 1.

Table 1 Road segment number

4.2 Traffic flow data correlation calculation

After the wavelet de-noising process described above, the spatio-temporal correlation analysis is performed on the sixty road segments. The correlation coefficients are calculated from the traffic flow data of June 10, 2016. The correlation coefficient matrix $R$ obtained according to Eq.(26) is shown in Table 2.

Table 2 Correlation coefficient matrix

Using the correlation coefficient matrix $R$ in Table 2 and the compression matrix method described in Section 3.3, different threshold values are set and the numbered road segments are grouped according to the correlation coefficient.

4.3 Performance evaluation indicators

To evaluate the performance of the prediction results, the root mean square error (RMSE), the mean absolute percentage error (MAPE) and the accuracy are used as evaluation indicators, which are defined as

(27)

(28)

(29)
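
The three indicators can be computed as in the sketch below. RMSE and MAPE follow their standard definitions. Taking the accuracy as 1 - MAPE is an assumption, although it agrees with the figures reported later in this paper: for LSTM-2, a 20 min MAPE of 7.79% less the stated increase of 3.37% gives a 5 min MAPE of 4.42%, and 100% - 4.42% = 95.58%. The function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be nonzero."""
    n = len(actual)
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def accuracy(actual, predicted):
    """Assumed definition: 1 - MAPE."""
    return 1.0 - mape(actual, predicted)
```

Note that MAPE is undefined when an observed flow is zero, so intervals with no traffic need to be excluded or handled separately before this function is applied.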

4.4 Compression matrix construction

The correlation coefficient threshold $\alpha$ determines the number of segments selected for the compression matrix, which affects the prediction accuracy for the entire road network. Therefore, a compression ratio (denoted as $R$) is defined here for each threshold. To obtain a suitable compression ratio, different values of $\alpha$ are set, and the relationship between $\alpha$ and $R$ is obtained through repeated experiments. The optimal correlation coefficient threshold is then obtained by analyzing the system running time $T$ (s) under the different values of $\alpha$. The mathematical expression of the compression ratio is

(30)

where $P$ is the total number of sections and $r$ is the number of groups.

It can be seen from Table 3 that the correlation coefficient threshold $\alpha$ determines the compression ratio $R$. As $\alpha$ increases, $R$ decreases and the running time of the system increases. With the prediction accuracy kept within a certain range, the system has the shortest running time when the correlation coefficient threshold is 0.95 ($R$=11%). When $\alpha$=0.95, the road segments are grouped as shown in Table 4.

Table 3 Different effects on running time

Table 4 Section grouping table

Take $\alpha$ = 0.84, 0.88, 0.92, 0.96, 0.98 and 0.99 in turn. By calculating the correlation and grouping the road sections, the 60 road sections can be grouped into 3 sections, 6 sections, 12 sections, 30 sections, 15 sections and 23 sections, respectively. From the above calculation and analysis, when the correlation coefficient threshold is 0.95, the 60 road segments are divided into five groups. One road section is randomly selected from each group, here the sections numbered 40, 22, 60, 10 and 44, and their traffic flow data form the road network compression matrix. The five numbered sections are then predicted separately, and their traffic flows are used to depict the traffic situation of the road network.

4.5 Prediction results

The training data for each road segment cover June 1, 2016 to June 24, 2016 (00:00-23:55). The sampling interval of the traffic flow data is 5 min and the time delay is 3, so each day yields 285 samples. The 24 days of training data therefore give 6 840 training samples in total, and the test set for June 25 to June 30 contains 1 425 samples.
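
The per-day sample count follows from a sliding window over one day of readings, as the sketch below shows. It assumes that each sample uses the 3 previous readings as input and the next reading as target, and that windows do not cross day boundaries, which is consistent with 285 samples per day. The function name is hypothetical and the series holds placeholder values, not real flow data.

```python
def make_samples(series, lag=3):
    """Turn a flow series into (inputs, target) pairs: `lag` past values -> next value."""
    return [(series[k:k + lag], series[k + lag]) for k in range(len(series) - lag)]


points_per_day = 24 * 60 // 5            # 288 readings at a 5 min interval
day = list(range(points_per_day))        # placeholder values, not real flow data
samples = make_samples(day, lag=3)
print(len(samples), 24 * len(samples))   # 285 per day, 6840 over 24 days
```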

Road sections No.22 and No.44 are selected for short-term traffic flow prediction. The test data set is divided into the original traffic flow data and the data after wavelet de-noising. The test data sets of the two selected sections are input into the LSTM-1 and LSTM-2 models, respectively, to verify the prediction efficiency of the LSTM model and the impact of data processing on its prediction accuracy. The prediction results are shown in Figs.8-11.

Fig.8 Prediction results of road section No.22 using model LSTM-1

Fig.9 Prediction results of road section No.22 using model LSTM-2

Fig.10 Prediction results of road section No.44 using model LSTM-1

Fig.11 Prediction results of road section No.44 using model LSTM-2

In Figs.8-11, 100 time points of the traffic flow test set, spaced 5 min apart, are taken as the abscissa, and the traffic flow is taken as the ordinate to display the prediction effect of the models. The four graphs show that, for both LSTM-1 and LSTM-2, when the actual traffic flow curve changes, the predicted curve follows a broadly consistent trend and the predicted values are relatively close to the actual values. This indicates that the LSTM model established in this paper has good predictive ability. Figs.8 and 9 show the traffic flow predictions for road section No.22. In Fig.8, the prediction curve and the actual curve are not well fitted at the first 20 time points, whereas in Fig.9 the two curves fit closely at all time points. The same two patterns appear in Figs.10 and 11 for road section No.44. The LSTM-2 model, trained on traffic flow data processed by wavelet de-noising, therefore shows better prediction performance, which indicates that using pre-processed traffic flow data for model training and prediction can improve the accuracy of traffic flow prediction.

4.6 Comparative analysis of prediction results

To evaluate the effectiveness of the proposed method, four other models were used for comparison, giving six models in total: SVR, ARIMA, CNN, ANN, LSTM-1 and LSTM-2. Firstly, the 5 min traffic flow prediction results are used to compare the performance of the models on the three evaluation indexes. The results are shown in Fig.12.

Fig.12 Five-minute prediction performance evaluation chart

Secondly, the prediction performance of the proposed model and the other models over different prediction horizons is examined. The traffic flows 10 min, 15 min and 20 min ahead are predicted with the different models, and their prediction performances are compared. The results are shown in Table 5.

Table 5 Comparison of traffic flow prediction with different duration

It can be seen from Table 5 that, under the same network structure, when the prediction horizon increases from 10 min to 20 min, the $\varphi_{\mathrm{MAPE}}$ of the LSTM-1 model increases from 9.49% to 12.13% and the $\varphi_{\mathrm{MAPE}}$ of the LSTM-2 model increases from 6.54% to 7.79%. A decrease in prediction performance can also be observed in the other models. Although the accuracy of all algorithms decreases as the prediction horizon is extended, the growth rate of the error varies significantly between models. For example, when the prediction horizon extends from 5 min to 20 min, the $\varphi_{\mathrm{MAPE}}$ of the road network LSTM-2 model increases by 3.37%. Meanwhile, the $\varphi_{\mathrm{MAPE}}$ values of ARIMA, SVR, ANN, CNN and LSTM-1 increase by 10.15%, 4.06%, 4.82%, 5.13% and 4.02%, respectively. These data show that the LSTM-2 model has the smallest error and that its error grows the most slowly as the prediction horizon is extended. The method and model proposed in this paper therefore have good prediction accuracy and stable performance over different time intervals.

5 Conclusions

In view of the shortcomings of current short-term traffic flow prediction models, this paper uses correlation analysis of traffic flow data to determine the structure of the road network used for prediction. The data are divided into two types, unprocessed data and data processed by the wavelet denoising method, and a road network matrix compression method is proposed to construct the processed data into a matrix. The data matrix is then input into the model, and the LSTM-1 and LSTM-2 models are trained to test the prediction effect. Based on the analysis of the simulation results, the following conclusions can be drawn:

1) According to the experimental results, the LSTM-1 and LSTM-2 models have better prediction accuracy than the other four prediction methods, reaching 91.89% and 95.58%, respectively. A comparison of the two models shows that LSTM-2 is superior to LSTM-1 in all prediction performance indexes. Therefore, it can be concluded that the model predicts better when its input data have been processed by the wavelet de-noising method.

2) By setting different prediction horizons and comparing with the prediction results of other models, it is found that the prediction accuracy of the LSTM-2 model reaches 95.58%. Compared with the ARIMA, SVR, ANN, CNN and LSTM-1 models, the prediction accuracy of LSTM-2 increases by 13.89%, 11.96%, 10.52%, 8.33% and 3.69%, respectively, which indicates that the proposed method and model have good prediction performance.

The method proposed in this paper contributes to the research on short-term traffic flow prediction, offers a way to address the social problems caused by traffic congestion, and promotes the development of intelligent transportation systems. Although the method improves the accuracy, finding a method grounded in mathematical theory to improve the overall predictive ability of the LSTM model is the focus of the next step. In addition, the prediction method for the road network is still rather simple, and real-time prediction at the road network level remains a real research challenge.
