
Support Vector Regression for Bus Travel Time Prediction Using Wavelet Transform

2019-07-25

Yang Liu, Yanjie Ji*, Keyu Chen and Xinyi Qi

(1. School of Transportation, Southeast University, Nanjing 210096, China; 2. Guangzhou Urban Planning & Design Survey Research Institute, Guangzhou 510060, China)

Abstract: To predict bus travel time accurately, a hybrid model that combines the wavelet transform with support vector regression (WT-SVR) is employed. In this model, wavelet decomposition extracts important information from the data at different levels and thereby enhances the forecasting ability of the model. After the wavelet transform, the different components are forecast by their corresponding SVR predictors, and the final prediction is obtained by summing the predicted results for each component. The proposed hybrid model is examined with data from bus route No. 550 in Nanjing, China. The performance of the WT-SVR model is evaluated by mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean square error (RMSE), and is compared with regular SVR and ANN models. The results show that the prediction method based on wavelet transform and SVR has better tracking ability and dynamic behavior than the regular SVR and ANN models. The MAPE falls to within 6% for testing section I and within 8% for testing section II, which indicates that the suggested approach is feasible and applicable to bus travel time prediction.

Keywords: intelligent transportation; bus travel time prediction; wavelet transform; support vector regression; hybrid model

1 Introduction

Bus travel time prediction is a vital component of advanced public transportation systems (APTS) and advanced traveler information systems (ATIS). With the rapid development of communication and network technology, accurate and real-time travel time forecasts are increasingly important. For bus operation management, they can help optimize bus route planning and the selection of stop locations and stop spacing, and identify appropriate road sections for bus priority strategies, so that better bus priority can be achieved with limited traffic supply. On the other hand, real-time and dynamic bus arrival time forecasts released through mobile applications can help passengers make more suitable travel plans, which not only shortens waiting times but also improves the service level of public transportation and attracts more passengers.

Previously, researchers have adopted various methods to forecast bus travel time, including historical average models [1], time series models [2], statistical regression models [3] and Kalman filter algorithms [4]. However, the prediction of bus travel time is very complex and highly nonlinear in nature, as it depends on many factors such as ridership, traffic flow, weather and traffic signals in the bus system. It is difficult for these methods to consider all of the factors, so the prediction quality is unsatisfactory in practice.

Over the past decade, machine learning models have shown a better capability to handle complex nonlinear mapping problems, particularly in travel time prediction, where the artificial neural network (ANN) has been widely applied. Park and Rilett analyzed the performance of ANN applications in bus travel time modeling [5]; Chien et al. [6] put forward two ANN models, based on links and on bus stations respectively, and applied them to bus travel time prediction. These studies show that the ANN model is well suited to bus travel time prediction. Further, a body of research shows that the ANN model outperforms historical average, statistical regression and Kalman filter models in bus travel time prediction [7-8]. However, the ANN model follows the principle of empirical risk minimization, which has some drawbacks, such as local optima, overfitting or underfitting, and limited generalization ability [9-10]. These drawbacks may reduce the effectiveness of artificial neural networks in travel time prediction to a certain degree.

The support vector machine (SVM), which is based on statistical learning theory, is a relatively new classification and regression technique from the field of artificial intelligence. It is good at finding statistical regularities in small samples and has a strong learning ability. Moreover, this technique has better generalization performance and makes it easy to balance generalization against goodness of fit. Owing to the structural risk minimization principle, SVM can effectively overcome the defects of ANN, and it has gained attention in the transportation domain. Besides, urban public transport is a non-stationary, time-variant and stochastic system, so SVM is a natural candidate for bus travel time prediction, where it has been found to perform well compared with other predictors [11-13].

The wavelet transform (WT), which can decompose the original data into various frequency components, has been successfully used in fields such as data analysis and signal processing. The wavelet transform provides useful information about the sub-series components of the original data, so the forecasting capability of a model can be improved by extracting useful information at different levels. In recent years, the wavelet transform has been applied in a number of research fields, such as temperature [14], water resources [15-16], wind energy [17] and share price prediction [18], where it is combined with other models to form a hybrid tool. Research findings indicate that hybrid models can be efficient and effective in improving forecast accuracy, and they have gradually been adopted in the transport domain. In a hybrid prediction model, several techniques are combined to take advantage of their unique features in data analysis and modeling. Every method has its strong points: for example, the wavelet transform (WT) offers frequency decomposition in the time domain, while the support vector machine (SVM) is good at handling nonlinear optimization problems. It is therefore worthwhile to combine these methods in bus travel time prediction in order to improve the accuracy of the prediction results [19-21].

In this study, the wavelet transform is used to capture the detailed information of bus travel time variation and to decompose the original data into several components at different frequencies. A separate SVR model is constructed to predict each component, from high frequency to low frequency. The final prediction result is the sum of the model outputs for the individual components. The main purposes of this study are to analyze the performance of the wavelet transform-support vector regression model in bus travel time prediction and to compare the performance of the WT-SVR model with that of other widely used models such as SVR and ANN.

2 Theory of the Model

The wavelet transform (WT) has excellent multi-resolution analysis characteristics. On the one hand, a signal can be decomposed into different levels and the features of each level can be displayed, which gives deep insight into the variation of the signal. On the other hand, transient abnormal components carried in a normal signal can be detected and displayed [22]. Compared with a traditional artificial neural network, the support vector machine replaces traditional empirical risk minimization with structural risk minimization and solves a quadratic optimization problem whose solution is, in theory, the global optimum. Therefore, applying the hybrid wavelet transform-support vector regression (WT-SVR) model to bus travel time prediction can capture the regularity of bus operation hidden behind seemingly random variation and improve the prediction accuracy.

2.1 Wavelet Transform

Suppose the function φ(t) ∈ L²(R) and its Fourier transform ψ(ω) satisfy the following condition (t and ω are real variables denoting time and frequency):
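The condition meant here is presumably the wavelet admissibility condition. In its standard form, written with this section's symbols (the constant C_φ is a label introduced only for this sketch), it reads:

```latex
C_{\varphi} = \int_{-\infty}^{+\infty} \frac{|\psi(\omega)|^{2}}{|\omega|}\, d\omega < \infty
```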

Then φ(t) can be called a wavelet base or mother wavelet. By dilations and translations of the mother wavelet, a family of wavelet functions can be obtained:
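In the standard formulation, written with this section's symbols and assuming the usual energy-preserving normalization, the family takes the form:

```latex
\varphi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\, \varphi\!\left(\frac{t-b}{a}\right), \qquad a, b \in \mathbb{R},\ a \neq 0
```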

where a represents the scale factor and b represents the translation factor. Letting a = 2^j and b = k·2^j, the discrete wavelet transform (DWT) can be written as follows:
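Substituting a = 2^j and b = k·2^j into the standard continuous wavelet family gives the dyadic wavelets and the corresponding transform coefficients. The symbol W_f(j,k) for the coefficient is an assumption made for this sketch:

```latex
\varphi_{j,k}(t) = 2^{-j/2}\, \varphi\!\left(2^{-j} t - k\right), \qquad
W_f(j,k) = \int_{-\infty}^{+\infty} f(t)\, \overline{\varphi_{j,k}(t)}\, dt, \qquad j, k \in \mathbb{Z}
```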

where k denotes the shift parameter and j denotes the resolution level. The larger the value of j, the lower the frequency band captured by the decomposition.

An effective way to apply the wavelet transform is the multi-resolution technique based on a scale function and a wavelet base function, which extract the low-frequency and high-frequency components of the series, respectively. The multi-scale decomposition process can be expressed as:
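A common way to write this nested decomposition, assuming the usual direct-sum notation of multi-resolution analysis and the symbols V_i and W_i used in this section, is:

```latex
V_0 = V_1 \oplus W_1 = V_2 \oplus W_2 \oplus W_1 = \cdots = V_n \oplus W_n \oplus W_{n-1} \oplus \cdots \oplus W_1
```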

where V_0 is the original signal; V_i is the approximate component of the signal, i = 1, 2, …, n; and W_i is the detail component of the signal, i = 1, 2, …, n.

For a given section of a bus route, the bus travel time in this section at time step t can be defined as f(t), with t = 1, 2, …, n and f(t) ∈ L²(R). The bus travel time series f(t) can therefore be treated as an input signal, which can be decomposed into different frequency bands through wavelet decomposition. The reconstruction expression of f(t) can be obtained by the Mallat multi-scale analysis algorithm as follows:
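A plausible form of this reconstruction is sketched below. It follows this section's naming, in which φ_{j,k} is the wavelet base function (paired with the coefficients c_{j,k}) and ψ_{j,k} is the scale function (paired with d_{j,k}); the symbol J for the deepest decomposition level is an assumption of the sketch:

```latex
f(t) = \sum_{j=1}^{J} \sum_{k} c_{j,k}\, \varphi_{j,k}(t) + \sum_{k} d_{J,k}\, \psi_{J,k}(t) = \sum_{j=1}^{J} D_j(t) + A_J(t)
```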

where c_{j,k} is the wavelet coefficient and d_{j,k} is the scale coefficient; φ_{j,k}(t) denotes the wavelet base function and ψ_{j,k}(t) denotes the scale function; A_j and D_j are the approximate and detail sequences of the original data after reconstruction, respectively. The flow chart of Mallat wavelet decomposition is shown in Fig.1.

Fig.1 Mallat wavelet decomposition
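The decomposition into approximate and detail sequences can be illustrated with a minimal sketch. It assumes the Haar wavelet purely for brevity (not necessarily the wavelet used in this study) and a series length divisible by 2^levels; the function name `haar_components` and the sample values are hypothetical. For Haar, the level-j approximation A_j is the mean over blocks of 2^j samples and the detail is D_j = A_{j-1} - A_j, so the components add back up to the original series.

```python
def haar_components(series, levels):
    """Return (A_n, [D_1, ..., D_n]), each as long as `series`.

    Haar-only shortcut: A_j holds the mean of each block of 2**j samples,
    and D_j = A_{j-1} - A_j, with A_0 being the series itself.
    """
    n = len(series)
    if n % (2 ** levels) != 0:
        raise ValueError("series length must be divisible by 2**levels")
    approx = [float(v) for v in series]  # A_0
    details = []
    for level in range(1, levels + 1):
        block = 2 ** level
        coarser = []
        for start in range(0, n, block):
            mean = sum(series[start:start + block]) / block
            coarser.extend([mean] * block)
        details.append([a - c for a, c in zip(approx, coarser)])
        approx = coarser
    return approx, details


travel_times = [4, 6, 10, 12, 8, 6, 5, 5]  # made-up values
a2, (d1, d2) = haar_components(travel_times, levels=2)
# Summing the approximation and all details recovers the input series.
rebuilt = [a + x + y for a, x, y in zip(a2, d1, d2)]
```

With a longer-support wavelet the block-mean shortcut no longer applies, and a filter-bank implementation of the Mallat algorithm is needed instead.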

2.2 Support Vector Regression

For regression problems, suppose a series of data points is given, namely {(x_1, y_1), (x_2, y_2), …, (x_n, y_n)}, where x_i is the input vector, y_i is the corresponding target value and n is the number of observations. To solve nonlinear regression problems, a set of nonlinear transfer functions is used to map the input space into a high-dimensional feature space, where in theory a simple linear regression can be found to approximate the given sample. According to statistical learning theory [23], the linear estimation function of SVR can be formulated as follows:
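In the standard SVR formulation, and using the symbols defined in this section, the estimation function is usually written as:

```latex
f(x) = \omega^{\mathsf{T}} \varphi(x) + b
```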

where φ(x) denotes a nonlinear transfer function into the feature space, ω is the weight vector and b is a constant. The coefficients ω and b can be calculated by minimizing the regularized risk function:
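A common statement of this risk function, assuming Vapnik's ε-insensitive loss with penalty parameter C and tube size ε (the labels R(C) and L_ε are introduced only for this sketch), is:

```latex
R(C) = \frac{1}{2}\,\|\omega\|^{2} + C\, \frac{1}{n} \sum_{i=1}^{n} L_{\varepsilon}\bigl(y_i, f(x_i)\bigr), \qquad
L_{\varepsilon}\bigl(y, f(x)\bigr) = \max\bigl(0,\ |y - f(x)| - \varepsilon\bigr)
```

The first term penalizes model complexity and the second penalizes only those errors that fall outside the ε-tube, which is how the formulation trades goodness of fit against generalization.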

After solving the above optimization problem by means of the Lagrange function and the corresponding optimality conditions, a nonlinear regression function can be given as:
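In the standard dual form of SVR, with Lagrange multipliers α_i and α_i* and a kernel function k, this regression function is usually written as:

```latex
f(x) = \sum_{i=1}^{n} \left(\alpha_i - \alpha_i^{*}\right) k(x_i, x) + b
```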

where α_i and α_i* are the two Lagrange multipliers and k(x_i, x) is a kernel function that describes the inner product in the high-dimensional feature space. By using kernel functions, all calculations can be carried out directly in the input space without mapping into the high-dimensional feature space. The structure of SVR is shown in Fig.2.

Fig.2 The topology structure of SVR

The performance and efficiency of SVM depend greatly on the kernel function, so it is very important to choose the kernel function and its parameters properly for the problem at hand. The common kernel functions are shown in Table 1.

Table 1 Common kernel functions of SVR

3 Model Development in Bus Travel Time Prediction

In this study, a hybrid WT-SVR model, formed by combining the support vector machine with the wavelet transform technique, is used to predict bus travel time. The model inputs and the wavelet decomposition are discussed briefly in this section.

Considering the variation in bus operation, four input variables and one output variable are used, as advised by Ji et al. [24]. First, bus travel time is non-stationary and fluctuates during the day; in the morning and afternoon peak hours in particular, bus travel times increase significantly. Second, road segments differ in number of intersections, segment length, traffic conditions and traffic flow composition, and all these differences can change bus travel times. Thus, the time of day should be classified into several periods, and the road segment should also be an input factor in this model. Moreover, bus travel time is easily influenced by many random factors such as traffic flow, ridership, weather, stop delay and traffic signal delay, but it is very difficult to obtain this information in real time in order to estimate the traffic condition of a road segment. Based on the research of Yu [25], this paper uses the latest bus travel time on the predicted section and the latest bus travel time on the preceding section to represent the current traffic condition of the predicted section and the running status of the bus, assuming that the latest travel time can be obtained from the bus information system in real time. Therefore, the four input variables are time of day (x1), road segment (x2), the latest bus travel time at the predicted section (x3) and the latest travel time of the current bus at the preceding segment (x4); y denotes the output, which represents the bus travel time from stop i to stop j. When a bus reaches stop i, the latest travel time from stop i-1 to stop i is updated.
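The four-variable input can be sketched as a small data structure. The peak windows follow the definition used in this study (7:00-9:00 and 17:00-19:00); the half-open treatment of the window boundaries, the 0/1 encoding of time of day, and all names (`is_peak`, `Sample`, `build_sample`) are assumptions made for illustration.

```python
from dataclasses import dataclass


def is_peak(hour, minute=0):
    """Assumed half-open peak windows: [7:00, 9:00) and [17:00, 19:00)."""
    minutes = hour * 60 + minute
    return 7 * 60 <= minutes < 9 * 60 or 17 * 60 <= minutes < 19 * 60


@dataclass
class Sample:
    time_of_day: int        # x1: 1 = peak, 0 = off-peak (assumed encoding)
    segment: int            # x2: index of the road segment
    latest_here: float      # x3: latest travel time at the predicted section
    latest_previous: float  # x4: latest travel time of the current bus at the preceding segment

    def as_vector(self):
        return [self.time_of_day, self.segment, self.latest_here, self.latest_previous]


def build_sample(hour, minute, segment, latest_here, latest_previous):
    """Assemble one model input from a timestamp and the two latest travel times."""
    return Sample(int(is_peak(hour, minute)), segment,
                  float(latest_here), float(latest_previous))


sample = build_sample(8, 30, 2, 310, 275)  # made-up travel times
```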

For a bus route, the bus travel time series at each segment is first decomposed into sub-series components (approximation components A and detail components D) using wavelet multi-scale decomposition. Input data such as the latest bus travel time in the current section and the latest bus travel time in the preceding section are taken from the corresponding bus travel time sub-series. The sub-series components (A and D) of the future travel time at the predicted section are predicted separately by different SVR models. Finally, the prediction result is the aggregation of the outputs of all models.

With respect to the model parameters, the radial basis function (RBF) is selected as the kernel function, since it can fit high-dimensional data with only a few hyperparameters and thus reduces the complexity of the prediction model. The RBF kernel function is defined as:

$$k(x_i, x) = \exp\left(-\gamma \lVert x - x_i \rVert^{2}\right), \quad \gamma > 0 \tag{10}$$
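Eq. (10) translates directly into code. The sketch below is a plain-Python illustration for two equal-length vectors; the function name `rbf_kernel` is chosen here for illustration only.

```python
import math


def rbf_kernel(x_i, x, gamma):
    """k(x_i, x) = exp(-gamma * ||x - x_i||^2), with gamma > 0."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, x_i))
    return math.exp(-gamma * squared_distance)
```

The kernel equals 1 for identical vectors and decays toward 0 as they move apart; a larger γ makes the decay faster, so each training point influences a narrower neighborhood.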

When the RBF kernel is used, three SVR parameters are considered: the penalty parameter C, the kernel parameter γ and the tube size ε. The overall prediction accuracy depends on a proper setting of these parameters, and the best combination of C, γ and ε can be determined by methods such as k-fold cross validation (CV), genetic algorithm (GA) and particle swarm optimization (PSO). For simplicity, five-fold cross validation is chosen to optimize the parameters of all SVR models.
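The parameter search can be sketched as a grid search scored by five-fold cross validation. To stay self-contained, the sketch swaps in a simple one-dimensional kernel smoother with a single parameter γ as a stand-in for SVR; the interleaved fold assignment, the use of mean absolute error as the score, the data, and all function names are assumptions, not the procedure used in this study.

```python
import math
from itertools import product


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; sample i goes to fold i % k."""
    for fold in range(k):
        val = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, val


def cross_val_error(fit_predict, xs, ys, params, k=5):
    """Mean absolute validation error of one parameter setting over k folds."""
    fold_errors = []
    for train, val in k_fold_indices(len(xs), k):
        preds = fit_predict([xs[i] for i in train], [ys[i] for i in train],
                            [xs[i] for i in val], **params)
        fold_errors.append(sum(abs(p - ys[i]) for p, i in zip(preds, val)) / len(val))
    return sum(fold_errors) / k


def grid_search(fit_predict, xs, ys, grid, k=5):
    """Try every combination in `grid`; return (best_params, best_error)."""
    names = sorted(grid)
    best = None
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = cross_val_error(fit_predict, xs, ys, params, k)
        if best is None or error < best[1]:
            best = (params, error)
    return best


def kernel_smoother(train_x, train_y, test_x, gamma):
    """Stand-in model (NOT an SVR): RBF-weighted average of training targets."""
    preds = []
    for x in test_x:
        weights = [math.exp(-gamma * (x - xi) ** 2) for xi in train_x]
        preds.append(sum(w * y for w, y in zip(weights, train_y)) / sum(weights))
    return preds


xs = [i / 20 for i in range(20)]
ys = [math.sin(6 * x) for x in xs]  # made-up data
best_params, best_error = grid_search(kernel_smoother, xs, ys,
                                      {"gamma": [1.0, 10.0, 100.0]})
```

For an actual SVR, the grid would span C, γ and ε jointly and `fit_predict` would train the regressor on each training fold.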

The structure of the prediction model is shown in Fig.3, and its details are as follows.

Fig.3 Diagram of bus travel time prediction model based on wavelet transform and SVR

1) The bus route under study is separated into k segments according to the bus stops. For convenience, the time of day variable is classified into peak hours (7:00-9:00 and 17:00-19:00) and off-peak hours.

2) The original bus travel time data are decomposed into a set of sub-sequences using the wavelet multi-resolution technique and the single-branch reconstruction method.

3) After the wavelet transform, each sub-series component is learned and trained separately by a support vector regression model. The parameters, including the penalty parameter C, the kernel parameter γ and the tube size ε, are optimized by cross validation and grid search.

4) The final prediction result is obtained by combining the prediction results of all SVR models, which can be expressed as $\hat{y} = \sum_{i=1}^{n} f(D_i) + f(A_n)$,

where f(·) denotes the nonlinear mapping function trained by SVR (one per component); D denotes the detail components of the predicted value and A denotes the approximation component of the predicted value; n is the decomposition level.

5) Performance is assessed by comparing the final forecast values with the prediction results of the ANN and SVR models.
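The aggregation in step 4 can be sketched as follows. The trained SVR models are represented here by arbitrary callables; the component names, the stand-in predictor and the input values are hypothetical.

```python
def predict_travel_time(component_inputs, component_models):
    """Sum the outputs of one predictor per wavelet component.

    component_inputs: dict mapping a component name (e.g. "D1", "A2")
        to the input vector built from that component's sub-series.
    component_models: dict mapping the same names to callables that
        stand in for the trained SVR of each component.
    """
    if component_inputs.keys() != component_models.keys():
        raise ValueError("every component needs exactly one model")
    return sum(component_models[name](component_inputs[name])
               for name in component_inputs)


def persistence(x):
    """Stand-in predictor (not an SVR): repeat the latest value, x3."""
    return x[2]


inputs = {  # made-up component values for x3 and x4
    "D1": [1, 2, -3.0, 0.5],
    "D2": [1, 2, 4.0, -1.0],
    "A2": [1, 2, 300.0, 280.0],
}
models = {name: persistence for name in inputs}
forecast = predict_travel_time(inputs, models)
```

Because each component has its own model, the parameters of each predictor can be tuned to the smoothness of the sub-series it sees.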

4 Numerical Test

4.1 Study Area and Data

To evaluate the applicability of the proposed WT-SVR model for bus travel time prediction, a southeast-bound corridor on Daqiao Rd. and a northwest-bound corridor from Jianning Rd. to Rehe Rd. on bus route No. 550 in Nanjing, China, were selected as experimental route sections. Route No. 550 is 10.2 km long and has 27 bus stops in the upstream direction, running from the Taifeng Road terminus to the Mochou Lake Park terminus. The bus headway is about 10 min in peak hours and about 15 min in off-peak hours. The study region of route No. 550 in this paper runs from the Qiaobei Coach Station stop to the Agricultural Trade Center stop and is divided into two sections, as shown in Fig.4.

Fig.4 Layout of study area of bus No.550

a) Section I: from Qiaobei Coach Station stop to Daqiao Hotel stop.

b) Section II: from Daqiao Hotel stop to Agricultural Trade Center stop.

The buses on this route are equipped with GPS and AVL devices that provide real-time travel time information. The bus travel time data were collected on weekdays from November 2, 2015 to November 10, 2015 during bus operating hours (05:10-21:10). After preprocessing, a total of 560 valid data sets remain, each containing the travel time of one bus through one road segment. The data sets are divided into two parts for training and testing: the observations from the six weekdays between November 2, 2015 and November 9, 2015 form the training set, and the data of November 10, 2015 form the testing set. To avoid numerical difficulties, the samples are normalized before modeling as follows:

where x_i denotes the i-th value of the input or output data set X = {x_1, x_2, …, x_n}.
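One common choice for this step is min-max scaling to [0, 1]. The sketch below assumes that choice, which the text here does not confirm, and also keeps the bounds so that predictions can be mapped back to seconds; the function names and sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]; also return (min, max) for the inverse map."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / (hi - lo) for v in values], (lo, hi)


def denormalize(scaled, bounds):
    """Map scaled values back to the original units."""
    lo, hi = bounds
    return [s * (hi - lo) + lo for s in scaled]


scaled, bounds = min_max_normalize([120, 180, 240])  # made-up travel times
restored = denormalize(scaled, bounds)
```

The bounds should be computed on the training set only and reused for the testing set, so that no information from the test day leaks into the model.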

4.2 Model Identification

4.2.1 WT-SVR model

The historical and real-time bus travel time data series are decomposed into several components by the wavelet transform at different levels, and each sub-series component is predicted by a separate SVR model. The decomposition level is the key parameter of the wavelet transform. If the decomposition level is too low, high-frequency noise remains in the low-frequency components, which directly affects the prediction accuracy of those components; if the level is too high, the complexity and training time of the model increase. Thus, in this study the 'db3' function is selected as the mother wavelet and the decomposition level is set to three, in line with the requirements of multi-scale decomposition and single-branch reconstruction. The components obtained at all levels are forecast separately by SVR models, and the future bus travel time equals the sum of the prediction results of the components. RBF is selected as the kernel function of the SVR models. The best combination of parameters for each SVR is shown in Table 2.

Table 2 Parameters selection of each SVR model

4.2.2 SVR model

To investigate the performance of the model, the proposed WT-SVR model is compared with a regular SVR and a BP-ANN, which are trained and tested on the same data sets. The regular support vector regression model takes the four model inputs (x1, x2, x3, x4) and produces one output (y) without wavelet decomposition. The best combination of parameters for the SVR is C = 1, γ = 0.0625 and ε = 0.003125.

4.2.3 ANN model

The ANN model used in this study has a hyperbolic tangent sigmoid transfer function and consists of an input layer, a hidden layer and an output layer. Different numbers of hidden-layer neurons are tested in the back-propagation neural network in order to identify a suitable, well-trained model. By trial and error, the optimal number of hidden-layer neurons is determined to be 8. The final ANN uses the same input features as the SVR, and its parameters are optimized by the back-propagation algorithm.

4.3 Results and Discussion

To evaluate the prediction performance, the proposed WT-SVR model is assessed by the mean absolute percentage error (E_MAP), the mean absolute error (E_MA) and the root mean square error (E_RMS). The formulas can be expressed as follows:

where y_i is the observed future value and ŷ_i is the predicted value of y_i. The smaller the values of E_MA, E_MAP and E_RMS, the better the prediction performance.
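The three measures follow their standard definitions, which can be sketched in a few lines. The function names and the sample values are chosen for illustration; E_MAP is returned in percent.

```python
import math


def mean_absolute_error(observed, predicted):
    """E_MA: average of |y_i - y_hat_i|."""
    return sum(abs(y - p) for y, p in zip(observed, predicted)) / len(observed)


def mean_absolute_percentage_error(observed, predicted):
    """E_MAP in percent: average of |y_i - y_hat_i| / |y_i|; needs y_i != 0."""
    total = sum(abs((y - p) / y) for y, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


def root_mean_square_error(observed, predicted):
    """E_RMS: square root of the average squared error."""
    total = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return math.sqrt(total / len(observed))


observed = [100.0, 200.0]   # made-up travel times in seconds
predicted = [110.0, 180.0]
e_ma = mean_absolute_error(observed, predicted)
e_map = mean_absolute_percentage_error(observed, predicted)
e_rms = root_mean_square_error(observed, predicted)
```

E_RMS weights large errors more heavily than E_MA, so a gap between the two points to a few trips with large prediction errors.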

The future bus travel time can be forecast by the WT-SVR model proposed in the previous section, and the prediction results for the two testing sections of bus route No. 550 are shown in Fig.5. It can be seen that the proposed hybrid model captures the underlying dynamics of bus travel time variation and achieves a good fit in both sections, with R² values of 0.7547 and 0.6306, respectively. The difference in R² is easily understood given the different traffic conditions on the two testing sections: section II contains many signalized intersections and bus stops, which may make its travel time more volatile and non-stationary than that of section I.

Additionally, a traditional BP neural network and a support vector regression model are tested in this paper for comparison. Fig.6 gives the absolute prediction error of the ANN, SVR and hybrid WT-SVR models on the testing links of bus route No. 550. The maximum prediction errors of ANN, SVR and WT-SVR are 244, 223 and 167, respectively, for section I and 331, 256 and 140, respectively, for section II. The hybrid WT-SVR model forecasts accurately and yields a lower prediction error than the other models on almost all trips. Moreover, Table 3 compares the E_MA, E_MAP and E_RMS values obtained by the WT-SVR, SVR and ANN models for the two testing sections. Compared with the single SVR model, the proposed hybrid model reduces E_MA, E_MAP and E_RMS by 15 seconds, 2% and 20 seconds, respectively, for section I and by 18 seconds, 2.5% and 23 seconds for section II. Similarly, compared with the BP-ANN model, the E_MA, E_MAP and E_RMS values of WT-SVR are lower by 26 seconds, 4% and 30 seconds, respectively, for section I and by 31 seconds, 4.5% and 45 seconds for section II. According to Lewis [26], an E_MAP value of less than 10% can be considered highly accurate. As shown in Table 3, the E_MAP values of the two reference models constructed in this paper are close to or even greater than 10%, which places their performance on the boundary between "good" and "highly accurate", whereas the E_MAP values of the WT-SVR model show that its predictive performance is "highly accurate". The prediction results of the model constructed in this paper are therefore more accurate and reliable, and the model is feasible and effective for bus travel time prediction. For arrival time forecasts issued to passengers, the value of the information depends heavily on the reliability of the forecast; reducing the prediction error helps prevent passengers from missing a bus because of wrong information and makes the information more usable.

Fig.5 Prediction results of WT-SVR in two testing sections

Fig.6 Prediction error of three models in testing sections

Table 3 Comparison of WT-SVR with ANN and SVR models

5 Conclusions

In this paper, the applicability of a hybrid WT-SVR model has been investigated for predicting the bus travel time of route No. 550 in Nanjing, China. The WT-SVR model was developed by integrating the wavelet transform technique with the support vector regression model. In the developed model, the original travel time data were decomposed into approximate and detail components by the wavelet transform, and an SVR model was constructed for each component of the future travel time. The model was tested using four input variables, namely time of day, road segment, the latest travel time of the previous section and the latest travel time in the predicted section, and was compared with regular SVR and ANN models on the same dataset. The results show that bus travel time prediction based on the wavelet SVR provides higher accuracy than the regular SVR and ANN models, as the wavelet transform captures travel time variations at different scales and thus enhances the forecasting ability of the SVR model. Therefore, the proposed model can considerably improve the prediction of bus travel times, which would contribute to a higher service level and greater predictive reliability.