Vehicle type mining and application analysis based on urban traffic big data
2019-08-01
JI Lina, CHEN Kai, YU Yanwei, SONG Peng, WANG Shuying, WANG Chengrui
Abstract: Real-time urban traffic monitoring has become an important part of modern urban management, and the traffic big data collected by video surveillance is increasingly applied to urban management and traffic control. However, the huge volume of citywide surveillance traffic data is still rarely used for research on urban traffic and urban computing. This paper studies vehicle type mining and its applications on the citywide surveillance traffic big data of a provincial capital city. Firstly, three vehicle types with an important influence on urban traffic are defined: periodic private cars, taxi-like vehicles and public commuter vehicles; the corresponding mining methods are proposed by combining these definitions with frequent sequential pattern mining algorithms. Experiments on one week of data from Jinan, comprising 120 million vehicle records from 1704 video monitoring points, verify the effectiveness of the proposed definitions and mining methods. Secondly, taking four residential communities as examples, the residents' traffic modes and their relationship with the distribution of surrounding Points of Interest (POI) are mined and analyzed. In addition, the potential of combining urban traffic big data with POI for urban planning, demand forecasting and preference recommendation is explored.
Key words: data mining; traffic big data; vehicle type; traffic mode; Point of Interest (POI)
CLC number: TP274
Document code: A
0 Introduction
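Real-time traffic monitoring is an important task in modern urban management; it helps in understanding the real-time operating status of vehicles, people and public transport across a city. This is of great value to urban applications such as intelligent transportation systems, public safety, traffic scheduling and control, and urban computing[1]. In recent years, video surveillance has been widely applied to urban traffic management; in particular, during China's rapid urbanization, cities large and small have largely completed the deployment of video traffic monitoring on their arterial roads. Video surveillance is generally deployed at a city's important intersections. As shown in Fig. 1, in each direction entering an intersection, a group of high-definition cameras is mounted on a horizontal bar to monitor the vehicles travelling in each lane entering the intersection. The cameras work together with a main control unit and virtual or buried induction loops in the road surface to detect and capture passing vehicles. With the development of artificial intelligence, existing traffic monitoring systems not only monitor and track passing vehicles but can also detect vehicle speed and driving direction and recognize rich auxiliary information such as license plate number, vehicle type, vehicle color and vehicle brand. Based on these monitoring data, many traffic violations, such as running a red light or speeding, can be identified automatically without human intervention. Traffic congestion and accidents can also be discovered in real time from the video surveillance, and this information can then be used to redirect pedestrians or vehicles so that traffic conditions do not deteriorate further. In addition, the traffic volume on monitored roads is easy to count, and this information is crucial to applied research such as traffic congestion prediction, urban planning, traffic control and even air pollution assessment[2].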
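To illustrate how traffic volume can be counted from monitoring records, the following is a minimal sketch. The record layout (plate, monitoring point ID, ISO timestamp), the sample values and the function name `hourly_volume` are assumptions made for illustration and are not taken from the Jinan dataset.

```python
from collections import Counter
from datetime import datetime

# Hypothetical record layout: (plate, monitoring_point_id, ISO timestamp).
records = [
    ("A123", "P1", "2016-08-01T08:05:00"),
    ("B456", "P1", "2016-08-01T08:40:00"),
    ("A123", "P2", "2016-08-01T08:15:00"),
    ("C789", "P1", "2016-08-01T09:10:00"),
]

def hourly_volume(records):
    """Count vehicle passages per (monitoring point, hour) bucket."""
    counts = Counter()
    for _plate, point, ts in records:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        counts[(point, hour)] += 1
    return counts

volume = hourly_volume(records)
for (point, hour), n in sorted(volume.items()):
    print(point, hour.isoformat(), n)
```

Each passage is counted once per record, so a vehicle that passes the same point twice within an hour contributes two counts; counting distinct plates instead would require a set per bucket.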
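A large body of work on urban traffic big data exists in China and abroad[3-5], and there are several real-world systems for collecting urban vehicle trajectory data. For example, the T-Drive project[6-7] of Microsoft Research Asia collected three months of Global Positioning System (GPS) trajectory data from more than 30,000 taxis in Beijing; Porto, Portugal collected trajectory data from 442 taxis over the nine months from August 2011 to April 2012[8-9]; and New York and Chicago in the United States publish, every year, the pick-up location of every passenger trip made by all taxis[10]. Recently, the domestic ride-hailing industry, for example Didi Chuxing, has also studied urban traffic data from taxis and ride-hailing vehicles[11]. However, most urban traffic data and related research are based on taxi data, which is only a small part of urban traffic data and is a biased sample of citywide traffic conditions that fails to reflect citywide traffic characteristics[12]. This is because taxis tend to avoid congested road segments and peak congestion periods[13].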
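Recently, traffic volume data covering 155 roads was collected in Guiyang. The data was collected with buried induction loops, which can only record the number of vehicles passing along each road; compared with video surveillance traffic data, it is not only smaller in scale but also lacks the rich auxiliary information. Although reference [14] uses license plate recognition data generated by 1040 cameras in Beijing, it uses the data only to discover vehicle accompanying patterns in the traffic flow.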
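In China's urban video surveillance traffic systems, arterial roads and important intersections are largely covered. In Jinan, for example, about 2000 groups of high-definition cameras are deployed at 1014 intersections, covering 2010 roads, and the driving routes of more than a million vehicles are monitored every day. However, such a huge monitoring system and its massive citywide vehicle data are rarely used for research on urban traffic and urban computing.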
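This paper mines and analyzes one week of citywide video surveillance traffic data collected in Jinan in August 2016. The data contains more than 100 million vehicle records and more than 4 million vehicles.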
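Firstly, the types of vehicles in citywide traffic are studied, and three vehicle types with an important influence on urban traffic are defined: periodic private cars, taxi-like vehicles and public commuter vehicles. Based on these definitions, mining methods for the three vehicle types are given, and the mining results are verified and analyzed. Based on the mining results, the influence of the three vehicle types on urban traffic during peak hours is analyzed, together with the role of vehicle type mining in improving intelligent transportation systems. Secondly, combining Points of Interest (POI) and taking residential communities as examples, case studies on the urban traffic big data analyze the residents' traffic modes and their relationship with the distribution of surrounding POI, and the potential of combining urban traffic big data with POI for urban planning, demand forecasting and preference recommendation is explored. Finally, the paper is summarized and future work is discussed.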
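As a rough illustration of the idea of finding routes that a vehicle repeats across days, the sketch below counts contiguous sequences of monitoring points that recur on several distinct days. This is a simplified stand-in for frequent sequential pattern mining, not the definitions or algorithms used in this paper; the record layout, the function names `daily_sequences` and `frequent_routes`, and the thresholds `length` and `min_days` are all assumptions.

```python
from collections import defaultdict

def daily_sequences(records):
    """Build, for each plate, a map day -> monitoring points ordered by time.

    Assumes records are (plate, point_id, ISO timestamp) tuples whose
    timestamps share one format, so string order equals time order.
    """
    by_key = defaultdict(list)
    for plate, point, ts in records:
        day = ts[:10]  # ISO date prefix, e.g. "2016-08-01"
        by_key[(plate, day)].append((ts, point))
    seqs = defaultdict(dict)
    for (plate, day), items in by_key.items():
        seqs[plate][day] = [point for _ts, point in sorted(items)]
    return seqs

def frequent_routes(day_seqs, length=2, min_days=3):
    """Return contiguous point sequences of the given length that occur
    on at least min_days distinct days for one vehicle."""
    days_seen = defaultdict(set)
    for day, seq in day_seqs.items():
        for i in range(len(seq) - length + 1):
            days_seen[tuple(seq[i:i + length])].add(day)
    return {route for route, days in days_seen.items() if len(days) >= min_days}

# Hypothetical usage: one vehicle passing P1 then P2 every morning.
records = [
    ("A123", "P1", "2016-08-01T08:00:00"),
    ("A123", "P2", "2016-08-01T08:10:00"),
    ("A123", "P3", "2016-08-01T18:00:00"),
    ("A123", "P1", "2016-08-02T08:00:00"),
    ("A123", "P2", "2016-08-02T08:10:00"),
    ("A123", "P1", "2016-08-03T08:00:00"),
    ("A123", "P2", "2016-08-03T08:10:00"),
]
routes = frequent_routes(daily_sequences(records)["A123"])
print(routes)
```

Restricting the search to contiguous sequences keeps the sketch short; a general sequential pattern miner such as PrefixSpan[15] also allows gaps between the matched points.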
4 Conclusion
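This paper mined and analyzed the citywide traffic big data of Jinan. Firstly, three vehicle types with an important influence on urban traffic were defined: periodic private cars, taxi-like vehicles and public commuter vehicles. They were mined, analyzed and verified on real data, and the mining results confirmed the effectiveness of the defined models and mining algorithms. Then, taking residential communities as examples, the traffic modes of the residents of several case communities and their relationship with nearby POI were analyzed. Finally, potential application directions of research value that may arise from deeply combining video surveillance traffic big data with POI were explored.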
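In future work, semantic matching for urban traffic big data will be studied in depth, for example exact matching of residential communities, matching of destination POI and matching of related activities. In addition, research is planned on inferring and predicting citywide traffic conditions (for example, traffic volume and speed) and on predicting the destinations of vehicle routes.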
References
[1] ZHENG Y, CAPRA L, WOLFSON O, et al. Urban computing: concepts, methodologies, and applications[J]. ACM Transactions on Intelligent Systems & Technology, 2014, 5(3):1-55.
[2] SHANG J, ZHENG Y, TONG W, et al. Inferring gas consumption and pollution emission of vehicles throughout a city[C]// Proceedings of the 2014 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2014: 1027-1036.
[3] ZHENG Y. Trajectory data mining: an overview[J]. ACM Transactions on Intelligent Systems and Technology, 2015, 6(3): Article No. 29.
[4] GAO Q, ZHANG F L, WANG R J, et al. Trajectory big data: a review of key technologies in data processing[J]. Journal of Software, 2017, 28(4):959-992. (in Chinese)
[5] MAO J L, JIN C Q, ZHANG Z G, et al. Anomaly detection for trajectory big data: advancements and framework[J]. Journal of Software, 2017, 28(1):17-34. (in Chinese)
[6] YUAN J, ZHENG Y, ZHANG C, et al. T-Drive: driving directions based on taxi trajectories[C]// Proceedings of the 2010 ACM SIGSPATIAL Conference on Advances in Geographical Information Systems. New York: ACM, 2010:99-108.
[7] YUAN J, ZHENG Y, XIE X, et al. T-Drive: enhancing driving directions with taxi drivers' intelligence[J]. IEEE Transactions on Knowledge & Data Engineering, 2013, 25(1):220-232.
[8] MOREIRA-MATIAS L, GAMA J, FERREIRA M, et al. Predicting taxi-passenger demand using streaming data[J]. IEEE Transactions on Intelligent Transportation Systems, 2013, 14(3):1393-1402.
[9] FERREIRA M, DAMAS L. Time-evolving O-D matrix estimation using high-speed GPS data streams[J]. Expert Systems with Applications, 2016, 44(C):275-288.
[10] YAO H, TANG X, WEI H, et al. Modeling spatial-temporal dynamics for traffic prediction[J/OL]. arXiv Preprint, 2018, 2018: arXiv: 1803.01254 [2018-12-03]. https://arxiv.org/abs/1803.01254.
[11] YAO H, WU F, KE J, et al. Deep multi-view spatial-temporal network for taxi demand prediction[J/OL]. arXiv Preprint, 2018, 2018: arXiv: 1802.08714 [2018-12-03]. https://arxiv.org/abs/1802.08714.
[12] ZHAN X, ZHENG Y, YI X, et al. Citywide traffic volume estimation using trajectory data[J]. IEEE Transactions on Knowledge & Data Engineering, 2017, 29(2):272-285.
[13] MENG C, YI X, SU L, et al. Citywide traffic volume inference with loop detector data and taxi trajectories[C]// Proceedings of the 2017 ACM SIGSPATIAL Conference on Advances in Geographical Information Systems. New York: ACM, 2017: 1-10.
[14] ZHU M L, LIU C, WANG X B, et al. Vehicle accompanying pattern discovery method based on license plate recognition flow data[J]. Journal of Software, 2017, 28(6):1498-1515. (in Chinese)
[15] PEI J, HAN J, MORTAZAVI-ASL B, et al. Mining sequential patterns by pattern-growth: the PrefixSpan approach[J]. IEEE Transactions on Knowledge & Data Engineering, 2004, 16(11):1424-1440.
[16] AYRES J. Sequential pattern mining using a bitmap representation[C]// Proceedings of the 2002 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2002:429-435.
[17] ZAKI M J. SPADE: an efficient algorithm for mining frequent sequences[J]. Machine Learning, 2001, 42(1/2):31-60.
[18] WANG J, HAN J, LI C. Frequent closed sequence mining without candidate maintenance[J]. IEEE Transactions on Knowledge & Data Engineering, 2007, 19(8):1042-1056.
[19] GOMARIZ A, CAMPOS M, MARIN R, et al. ClaSP: an efficient algorithm for mining frequent closed sequences[C]// Proceedings of the 2013 Pacific-Asia Conference on Knowledge Discovery and Data Mining. Berlin: Springer, 2013:50-61.
[20] FOURNIER-VIGER P, WU C W, GOMARIZ A, et al. VMSP: efficient vertical mining of maximal sequential patterns[C]// Proceedings of the 2014 Canadian Conference on Artificial Intelligence. Berlin: Springer, 2014: 83-94.
[21] FOURNIER-VIGER P, WU C W, TSENG V S. Mining maximal sequential patterns without candidate maintenance[C]// Proceedings of the 2013 Advanced Data Mining and Applications. Berlin: Springer, 2013:169-180.
[22] Jinan City Government Portal. Jinan will add 500 taxis[Z/OL]. [2018-12-03]. http://www.jinan.gov.cn/art/2014/5/24/art_1862_216217.html. (in Chinese)
[23] Jinan Times. Jinan to add 500 taxis within the year; a hearing will be held soon to gather public opinion[N/OL]. [2018-12-03]. http://www.sdnews.com.cn/sd/jinan/201307/t20130725_1292174.htm. (in Chinese)
[24] QIN Z, WANG X F. Registered ride-hailing drivers reach 200,000; more than half of the drivers have no Jinan household registration[Z/OL]. [2018-12-03]. http://news.e23.cn/jnnews/20161025/2016A2500027.html. (in Chinese)
[25] ZHANG J, ZHENG Y, QI D. Deep spatio-temporal residual networks for citywide crowd flows prediction[J/OL]. arXiv Preprint, 2016, 2016: arXiv: 1610.00081 [2018-12-03]. https://arxiv.org/abs/1610.00081.
[26] XIE M, YIN H, WANG H, et al. Learning graph-based POI embedding for location-based recommendation[C]// Proceedings of the 2016 ACM International on Conference on Information and Knowledge Management. New York: ACM, 2016:15-24.
[27] WANG W, YIN H, CHEN L, et al. ST-SAGE: a spatial-temporal sparse additive generative model for spatial item recommendation[J]. ACM Transactions on Intelligent Systems and Technology, 2017, 8(3): Article No. 48.