APP下载

Pattern recognition of optimal traffic path based on HMM

2020-11-25ZHAOShuxuWUHongweiLIUChangrong

ZHAO Shu-xu, WU Hong-wei, LIU Chang-rong

(School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China)

Abstract: In order to alleviate urban traffic congestion and provide fast vehicle paths, a hidden Markov model (HMM) based on multi-feature data of urban regional roads is constructed to solve the problems of low recognition rate and poor instability of traditional model algorithms. At first, the HHM is obtained by training. Then according to dynamic planning principle, the traffic states of intersections are obtained by the Viterbi algorithm. Finally, the optimal path is selected based on the obtained traffic states of intersections. The experiment results show that the proposed method is superior to other algorithms in road unobstruction rate and recognition rate under complex road conditions.

Key words: hidden Markov model (HMM) ; Viterbi algorithm; traffic congestion; optimal path

0 Introduction

With the increase of car ownership, traffic congestion and environmental pollution become serious in cities[1]. Therefore, it is necessary for an intelligent traffic system to make full use of the existing road resources and traffic data, which can help to adjust the departure interval of buses and release useful traffic information.

At present, there are two types of methods for selection of optimal traffic path. One is to establish a decision-making model with multi-object function, which has the shortest path between the start point and the end point as well as the shortest driving time. In Ref.[2], the optimal solution of multi-object optimization model was obtained by genetic algorithm. However, it is difficult to avoid real-time traffic congestion. In Ref.[3], according to the distance between an ideal point and multiple self-calibration points, the shortest weight paths of all object points were calculated by Dijkstra algorithm, but the ideal point in actual roads is difficult to define. The other is to transform a multi-path problem into a multi-object solution problem by swarm intelligence optimization. However,, increasing model size gives rise to decrease of algorithm stability, thus the algorithm has poor adaptability to actual road conditions. In Ref.[4], a multi-path selection algorithm based on overlapping penalty was proposed. The obtained path is more suitable for actual conditions and the driving time is shortened, but it is difficult to ensure the vehicle path unobstructed under complex road conditions. The algorithm proposed in Ref.[5] has fast convergence and high search quality in solving the optimal path. However, for complex road conditions, dynamic paths are difficult to determine. Therefore, an improved fuzzy c-means clustering algorithm was proposed to identify road traffic conditions[6]. However, the result of clustering is not in accordance with the actural traffic conditions, accordingly, the optimal path is not in accordance with the actural traffic conditions. Therefore, although the driving time is shortened, road congestion rate is high.

In our work, a method of multi-feature data recognition of regional reads for optimal path selection based on hidden Markov model (HMM) is proposed, in which HMMs are trained by means of historical data to match the optimal path. The obtained optimal path has a slight defect in distance, but it is superior to other model algorithms in performance.

1 Data processing

In our work, the data collected are from Zibo City, Shandong Province. Because large volume of original data and redundant attributes cannot meet the requirements of data modeling, data cleaning is necessary.

After reading the original data, analyzing the GPS data of vehicles and extracting the relevant features, we have found that vehicle ownership is missed seriously, which should be taken as feature data to input the algorithm before constructing the model. In order to improve the trend of the HHM fitting the actual traffic conditions, the average value of vehicle ownership should be increased or decreased to fill the standard deviation, thus the result is closer to the actual value and the distribution is more concentrated[7], as shown in Fig.1.

Fig.1 Distribution diagram of vehicle ownership after being filled

Abnormal value is to detect whether the original data contain unreasonable data because such data will reduce generalization ability of the model[8]. In the process of collecting the original data of Linzi District, due to manual statistics and instrumental measurement, some feature data are deviated from the observation data obviously. By analyzing the original data collected, the abnormal distance data of the intersections are founded, as shown in Fig.2.

Fig.2 Intersections abnormaly detected in Linzi District

It can be seen from Fig.2 that the distance data of the intersections are concentrated in general, but there are abnormal data with obvious right deviation in four intersections, which has a certain negative effect on the traffic state trend of the actual intersections, therefore they should be deleted as disturbance variables.

In the actual modeling process, in order to extract more useful data feature information and improve the deep mining effect of the model, we need to consider the feature factors of vehicles because they affect road congestion of intersections. As shown in Fig.3, the number of cars is relatively larger (72.6%) compared to other vehicles (public bus, 13.2%; other vehicles, 14.2%).

Fig.3 Road occupacy rates of passenger cars, public bus and other vehicles, respectively

It can be seen from Fig.3 that this region has the characteristic attribute of the car due to its high road occupacy rate at the intersection. These data indirectly reflect the strong correlation between the characteristic attribute of the regional vehicles and the traffic state distribution at the regional intersections.

2 Construction of HHM

The traffic flow at the regional intersection shows a wave peak phenomenon in the morning and evening. The collected regional characteristic data and object attribute data of Zibo City can reflect the potential travel pattern of the intersection traffic system in the region by using the regional characteristic data of intersections to fit the macroscopic traffic state HMM of intersections and road networks. Here, the pattern recognition process of the optimal path in the region based on HMM is given. The framework is shown in Fig.4.

Fig.4 Pattern recognition process of optimal path based on HMM

In our work, HMM is used for path identification at the intersection of Huangong Road and Dashun Road in Linzi District of Zibo City. As we know, traffic state at the intersection is variable. Accordingly, in the learning process, the more the traffic state sequences are divided, the closer to the real traffic state the fitting effect of the traffic state based on time period, but this will increase experimental cost. Therefore, according to the road traffic regulations of the People’s Republic of China, the traffic states of the regional intersection are divided into three levels,w1,w2andw3, indicating congestion state, smooth state and slight congestion state, respectively, as illustrated in Fig.5.

Fig.5 Regional traffic state transfer

According to the actual complexity of the intersections, the number of traffic states (N) at the intersection is set to be three as the hidden state. At the same time, the number of data features used in our work is four, and accordingly the number of visible state data features (M) is set to be four. For the visible state length (T), considering the actual data collected in Linzi District of Zibo City,Tis selected as 12 in the process of data cleaning (selected time period is 9:00-15:00).

It is assumed that the traffic state of the intersection starts from the initial state, and the corresponding traffic situation is randomly transferred according to the selected HMM triangular topology, which keeps in the final traffic situation. When the number of traffic states isN, the probability distribution of the initial state can be selected asπ=[1,0…0]T. For the transfer probability matrixAwithN×Nstates, the initial state distribution is

(1)

For the observation probability matrixBwithN×Mstates, four observation states are selected to determine the initial distribution of the observation probability matrix. In other words, regional traffic volume, regional GPS location, resident original-destination and other feature data are used for observation probability. According to the probability calculation of the actual original data features, the initial observation probability matrixB[9-11]in the corresponding state mode is set to be

(2)

After setting the initial values of the above HMM model parameters, the initial values of HMM are determined. In our work, the key is to use the feature data collected in the region to fit the traffic states of the intersection at a certain time in the region so as to improve model accuracy. By use of feature data processed in real time, the traffic states of the entire regional intersections are identified. Then each intersection is used as a node matched by the GPS data in order to show the visible state on the map. Finally, the mode of the traffic state corresponding to the road network is recognized, the optimal path is obtained.

The recognition of the traffic state is based on determination of the model parameters that have been trained. By collecting the feature data of real-time observation state, the corresponding optimal state sequence at each moment is calculated, thus the most optimal connection points form a optimal path. At the same time, the optimal path is the realized by using Viterbi algorithm to dynamically plan the predicted problem based on HMM. According to dynamic programming principle, if the optimal path passes through nodeT*at timeT, all possible paths in the partial paths from this nodeT*to the end point are optimal, too.

First of all, we define two variablesαandφ, and then the maximum probability of all single paths (i1,i2,…,it) with stateiat timeTis defined as

i=1,2,…,N.

(3)

The recursive formula forσis given as

αt+1(i)=

i=1,2,…,N;t=1,2,…,T-1.

(4)

Therefore, the (t-1)th node that is most likely routed in all single regional traffic paths is

i=1,2,…,N.

(5)

The implementation of the above process must be realized by Viterbi algorithm. The pseudo code of the Viterbi algorithm is as follows.

Begin

initialize Path←{ },t←0

Fort←t+1

i←1

Addwjto Path

Untilt=T

Return Path

End

Thus, an HMM for dealing with vehicle space-time data is constructed, which can solve the optimal path well and make useful travel plan for traffic management department and traffic travelers.

3 Experiment and analysis

3.1 Data analysis based on proposed model

In order to test the effect of the proposed HMM on pattern recognition of optimal traffic path, we take five main streets in Linzi District as samples, adopt the characteristic data during the period from 9:00 to 15:00 in one week in the region, select the intersection of Huangong Road and Dashun Road as the test point of the model, and match the appropriate traffic travel route in accordance with the real living data.

Since the HMM model is a probability generation model, according to the relevant road traffic congestion index level, we divide the above-mentioned intersection states in the region into three levels. Considering the strong randomness of traffic flow, there are three traffic states hidden in transportation system. In the training process of HMM, four-dimension feature data are input in the batch mode, and the parameters of the HMM are shown in Fig.6.

Fig.6 Parameters of HMM traffic states

The road traffic state at the intersection in Linzi District, Zibo City, Shandong Province, during six hours is shown in Fig.7. It can be seen that the traffic state is in a non-clear state when it is at around 13:00 in working hours.

Fig.7 Fitted waveform of traffic state

3.2 Comparison with other methods

To better illustrate the advantages of the HMM in identifying regional traffic paths, we compare the feature data, average recognition rate and driving time obtained by our method with those obtained by wolf group algorithm and fuzzy c-means clustering algorithm, and the results are shown in Table 1.

Table 1 Performance comparison of three algorithms

It can be seen that the HMM is superior to the other two methods. Therefore, it can be well applied to the optimal path identification in the region.

The recognition rates of HMM algorithm and c-means clustering algorithm are compared, as shown in Fig.8. It can be seen that the recognition rate of HMM algorithm based on multi-feature data is significantly higher than that of c-means clustering algorithm.

Fig.8 Comparison of recognition rates between HMM algorithm and c-means clustering algorithm

3.3 Experimental visualization

In order to more intuitively illustrate the effect map of the optimal path, the relevant feature data points on the road map in the region are visualized, and the actual road network distribution data are obtained through the Google map interface, as shown in Fig.9.

c

Fig.9 Map route data of Linzi District

In Fig.9, each gray visualized point represents the data infromation of a road node, which is an intersection, the location of traffic light or other situations. In order to show the regional road traffic nodes, we simplify the data type of the road, only retaining the node data of intersection. The visualization of the data trained is shown in Fig.10.

Fig.10 Regional road node data distribution

According to the Viterbi algorithm and the consistency of the traffic state with path of each intersection in time and space, we choose the optimal path composed of two points as the start point, as shown in Fig.11.

Fig.11 Optimal path distribution of regional road nodes

HMM can identify the road conditions in the next time based on time series in the pattern recognition of the optimal traffic path. According to the previous feature data, it can provide better travel arrangements for traffic travelers. The specific road distance, running time and driving speed of the vehicle can well reflect smoothness of the road[21-22]. Furthermore, the traffic conditions in the city are well divided in time and space.

4 Conclusion

Based on the regional characteristic data in the city, this paper presents a method for solving the regional optimal traffic path based on HMM. By selecting the data set collected in the urban area, the hidden state sequence and observation sequence of the HMM are divided. And then, the model parameters are trained by the maximum likelihood estimation method. Finally, the Viterbi algorithm is used to select the optimal path based on the trained HMM. The experimental results show that the path obtained by the method has great advantages in the case of complex road traffic. It provides a reference for alleviating traffic congestion in cities.