APP下载

Secure Model to Generate Path Map for Vehicles in Unusual Road Incidents Using Association Rule Based Mining in VANET

2018-07-27ArunMalikBabitaPandeyandChiaChunWu

Arun Malik, Babita Pandey, and Chia-Chun Wu

Abstract—Logical behavioral arrangements are a class of conventional arrangements to illustrate the happening of incidents in an appropriate and structured approach in vehicular ad hoc network (VANET). These incidents are characterized as a list of path segments that are passed through by the vehicles for the duration of their journeys from a pre-decided local source to a local destination in a structured manner. A set of proper description illustrating the paths traversed by the vehicles as logical behavioral arrangements is described in this paper. The data gathering scheme based on secure authentication to gather the data from the vehicles is proposed in this paper. This proposed data gathering scheme based on secure authentication is compared with the existing data gathering schemes by using veins framework and the results of analysis reflect that the proposed scheme outperforms among others.The data collected from the vehicles by the proposed data gathering scheme is stored at distributed road side units (RSUs). From these collected paths, the common and frequent paths opted by the vehicles in a certain region are determined by using frequent arrangement mining approach. An estimation model is used to decide the next path and the whole path map opted by the vehicles in unusual situations like accident, jams, or a particular time of day. The proposed scheme will help the society in reducing the waiting time in vent of emergency or normal working days.

1. Introduction

The vehicular ad hoc network (VANET) is an artistic branch of the conventional mobile ad hoc network(MANET)[1],[2]. VANET provides wireless communication among vehicles and between vehicles and road side units(RSUs). The communication that vehicles exchange information among each other is acknowledged as the vehicleto-vehicle (V2V) communication whereas the communication that vehicles exchange information with RSUs is acknowledged as the vehicle-to-road side unit (V2R)communication. Vehicles are outfitted with some gadget called on board units (OBUs) that can establish communications with other vehicles and with RSUs using dedicated short range communication (DSRC) in VANET[3],[4].

This communication aims to advance together comfort and safety on path. VANET reveals various specific features like high mobility, fast dynamic network topology, frequent dissection, etc. During the last few years the anticipation of research society has remarkably improved in VANET[5].Researchers and industry fundamentally acknowledge that VANET can remarkably enhance the traffic safety and path effectiveness and decrease ecological force[6].

To expect the routes opted by the vehicles during their journeys is one of the primary areas that has been tracked and studied by the researchers in VANET[7],[8]. Gaining advance information of a vehicle’s expected path capitulates many benefits. Calculation of jamming level in specific geographic areas at a particular time of the day is one of these benefits.Also, barricades and stored information can be positioned by the policemen deliberately and efficiently in chase of an escapee by making use of this information.

Logical behavioral arrangements may be utilized to order an estimation of next expected vehicle’s route in VANET. The collection of sequential mobility trace data is essential to predict the routes by the use of logical behavioral arrangements. Therefore, there is a desire for collecting behaviors of the moving vehicles as a proper sequence of path segments traversed by the vehicles during their journeys. In this research work, the data gathering scheme based on secure authentication is used to collect the data of the path segments traversed by the vehicles during their journeys. Depending on this data gathering scheme,series of path segments traversed by the vehicles from a particular local source to a particular local destination will be provided to the distributed RSUs, positioned at different locations on a geographical plot. RSUs gather these data and create a database of all the path segments traversed by the vehicles. After this, frequent data mining approach is applied on this database collected at individual RSU to generate local frequent arrangements in a certain region that exists within a convinced threshold.

The primary objectives of this paper are (a) to collect the data of the path segments traversed by the vehicles by the proposed data gathering scheme based on secure authentication and to create a database of past logical behavioral arrangements traversed by the moving vehicles, (b) to compare the proposed data gathering scheme with the schemes mentioned in [9]-[11],and (c) to utilize frequent arrangement data mining approach for developing drive report for the vehicles so that it can be followed in a real time manner. This generated report may be utilized to fetch a vision for the future path prediction for a particular moving vehicle.

The rest of the paper is prepared as: Section 2 evaluates associated work that covers path forecasts, position changes,and data mining techniques. To define logical behavioral arrangements in the framework of VANET, a few definitions are described in Section 3. After this, Section 4 describes the proposed data gathering scheme based on secure authentication to collect data of the path segments traversed by the vehicles.The performance of the proposed data gathering scheme is evaluated by an extensive set of simulations with their results in Section 5. From the collected possible paths traversed by the vehicles, the common and frequent paths opted by the vehicles in a certain region are determined by using frequent arrangement mining approach in Section 6. Section 7 concludes the paper finally.

2. Related Work

From the past few years, various government transportation organizations, employers, and researchers have been paying concentration in the paths opted by the vehicles during their tours. The authors in [12] and [13] have projected the exploit of wireless fixed asynchronous transfer mode (ATM) based network units to trace motion of moving vehicles all the way through a geographical area. Routes traversed by the vehicles during their trips are described in the series of ATM units.

The authors in [14] have described a vehicle tracking scheme to make available precise data on directional traffic calculations at junctions. The mined count ups are provided to guess a source destination tour table that is essential data for traffic collision study as well as transportation planning. In [15]an automatic vehicle categorization and tracking technique has been proposed to estimate the vehicular movement traffic parameters at signalized intersections. This technique has a good talent to categorize the identified vehicles and then parameters of vehicle movement are calculated at the intersection area. In [16], a nonlinear movement model using the shape model and projection model has been proposed to detect and track the vehicle in a nonlinear motion. The precisely calculated results of location and velocity of the vehicle are given by the speed, posture, and the last location of the vehicle.

Various types of applications have been introduced in terms of data mining for variety of purposes. In [17], identification of malicious and faulty vehicles to prevent erroneous or unsafe data was done by using the association rule based data mining technique. In [18], an assistance system using liberal context aware driver for VANET has been proposed. Association rules are used by this proposed system to obtain semantic information that will help to prevent the happening of accidents on road and reduce the number of traffic losses. In [19], a vehicular data platform and two data mining models have been proposed. In the vehicular data platform, vehicle information such as safeguard information can be smartly attached to vehicle’s driver description. Two data models are implemented for natural language processing.

Route and mobility forecasting have been studied and examined in numerous researches. As in [20] to attain a proficient multi-hop broadcast, a reliable scheme based on mobility estimation for broadcast routing has been proposed.This scheme partitions the neighbors in numerous sets firstly in accordance to the movement route; then to predict the maintain time of all the neighbors, this scheme utilizes the position and velocity; finally multiple rebroadcast based nodes are chosen.In [21], to predict the next path segment, an amalgamation of probabilistic suffix trees (PSTs) and a variable-order Markov model were used. In [22], the authors have proposed and tested algorithms to predict the route of a vehicle from a source to a destination based on global positioning system(GPS) surveillances of the prior vehicles trips. The proposed algorithms utilize the fact for predicting the next segment that a huge section of a typical driver’s tours are repeated. In [9], a routing algorithm that is receiver based(RBRA) was proposed, which makes use of a routing metric.The routing metric considers the remaining lifetime of a link along with the length of each hop. Moreover, vehicles follow a designed mobility model that is designed to identify the motion of moving cars. In [10], the performance of routing protocol based on position in VANET was improved, for this, an intermediate node selection algorithm (INSA) that predicts vehicle motions has been proposed. In [11], the full path vehicle based information collection scheme (FPVBS)-broadcast mode is compared with the conventional data collection schemes and frequent paths are predicted by applying mining on the data collected from this scheme.

3. Logical Behavioral Arrangements Formal Definitions

In order to use logical behavioral arrangements as a way for route estimation, a set of formal definitions is essential to understand the motion behavior of the moving vehicles in VANET. These definitions are customized for characterizing the motion arrangements of the vehicles in VANET.

Definition(a). Letstand for the segment set of the path segments in a particular physical region or map which illustrates the demonstration of path segments and path intersections. Definition (a) provided above can be better recognized in Fig. 1 which demonstrates a capture of a physical region or map for illustrating the demonstration of path segments and path intersections. It can be deduced that a path available in a specific path should be referred as a path segment, whereas the path linked to various directions is all together as a separate path segment. Furthermore, every path intersection can connect two or more path segments simultaneously.

Fig. 1. Path segments and intersections.

Definition(b). Letbe the vehicle set of moving vehicles that are traveling in the particular physical region for the duration of a certain time period.

Definition(c). Letstand for the motion arrangement of the vehicle that illustrates the path segments traversed by the vehicle during its journey in a particular geographical region inside a given map.

The set of motion arrangements of the vehicles during their journeys in a particular geographical region is stored in the motion database of the vehicles, as shown in Table 1. This motion database contains entry of every vehicle traveling in a particular geographical region to illustrate its motion arrangements.

Table 1: Motion database

Definition(d). Logical behavioral arrangements are the frequently occurring ordered arrangements or sub-arrangements as arrangements. A motion arrangementis supposed to be a logical behavioral arrangement or subarrangement of any MP if the order of elements is restricted to the same order as given in the arrangements of MP. For example, here the motion arrangement [S2, S3, S5] is considered to be a sub-arrangement of the arrangements [S0, S1, S2, S3, S5,S6, S8] and [S1, S2, S3, S5, S6] but not a sub-arrangement of [S1,S2, S3, S4, S5, S6, S7]. Although in the last arrangement the elements are presented but they follow a different sequence and consequently they should not be treated as a sub-arrangement.

Definition(e). Support of the motion arrangement MP,symbolized as Support(MP), is described as the number of motion arrangements in motion arrangements database where MP can be recognized as a logical behavioral arrangement or sub-arrangement.

Definition(f). Ifis a motion arrangement: A motion ruleRis defined as an implication of the formwhere MP1and MP2work as the subarrangements of MP and they have different elements (path segments) presenting between MP1and MP2(i.e.,Support of the rule is symbolized as Support(MP)where MP is calculated as the percentage of motion arrangements in motion database which contains MP1∪MP2

(that is, the union of motion arrangements MP1and MP2or taken both MP1and MP2). This is considered as the probability,where the notationspecifies the probability of a motion arrangement that will hold the union of motion arrangement MP1and motion arrangement MP2(that is,it holds each and every path segment in MP1and MP2). The confidence of the ruleRis defined as Confidence(R) whereRis the percentage of the motion arrangements in the motion database holding MP1that also holds MP2. This is considered as the conditional probability. Therefore,

For generating motion arrangements, it is necessary to collect the vehicle’s motion behavior information in a secure manner. After collecting the motion behavior information the rules are produced by using the predefined arrangements which are stored in the database. For example,consider the following arrangement MP=[S2, S3, S5, S6]. The set of possible rules will be,, and.

4. Proposed Data Gathering

By using the scheme based on secure authentication to extract the frequent logical behavioral motion arrangements of the vehicles from a particular source to a particular destination,the collection of beyond mobility facts over a time period is essential. After gathering all the information of paths opted by the vehicles, the most frequently opted paths by the vehicles can be mined by using the frequent arrangement mining approach and can be used as vehicular motion arrangements. In this section, the data gathering scheme based on secure authentication to collect data of the path segments traversed by the vehicles is presented. In this data gathering scheme authentication of the vehicles is executed at the RSU so that the moving vehicle can validate to the RSU that has a license. For enhanced protection, there is a need of the RSU to substantiate the fact that it is certified as well, in order to comprise reciprocal verification. A tricky session key could be set up among the RSU and vehicle for establishing the subsequent communication during authentication. A time honored furtive session key could be required so that it will synchronize changes both at the RSU and vehicle so as to maintain the countermeasures of location confidentiality.

When the mutual authentication of the RSU and vehicle is completed, the vehicle begins to communicate with the authenticated RSU. On the way from a particular source to a destination, the active path table (APt) used for accumulating the information of the new path segments traversed by the vehicle is updated. After a fixed value of threshold is reached,say ThSo, the vehicle starts transmitting the APt to RSU. To achieve the optimal length of message, the threshold is set to be 3, otherwise retransmission for messages lost will become cumbersome in situation of lengthy message and will eventually decrease the network throughput. Once the moving vehicle transmits the information of its new path segments to the RSU in the form of APt, the RSU will add the information of the new path segments to its path lane (PL) list consequently. If a new vehicle originates a message, the identity of the vehicle will be included in the RSU database. On the other hand, if a vehicle with an existing identity in the PL list sends a message,in that case the new path segments mentioned in the message are added to the existing paths information of that vehicle.Algorithm 1 represents the action of moving vehicle in this proposed scheme and Algorithm 2 represents the action of RSU in this proposed scheme. The notations used for writing pseudo code of Algorithms 1 and 2 are mentioned in Table 2.

Algorithm 1: Action of vehicle in the proposed scheme

Table 2: Notation applied in Algorithms 1 and 2

5. Performance Evaluation

Now, the performance evaluation of aforementioned secure authentication based data gathering scheme has been investigated by comparing this scheme with the algorithms proposed in [9] to [11] on the basis of communication overhead, packet delivery ratio, throughput, and latency. To evaluate the performance, the proposed secure authentication based data gathering scheme is implemented in the veins framework. Veins framework acts as an outfit of simulation models for vehicular networking. Veins framework is a combination of two well known simulators: SUMO, a path traffic simulator, and OMNeT++, an incident-based network simulator. The veins framework executes the simulation model by OMNeT++ while interacting with SUMO to provide a comprehensive collection for the inter vehicle communication(IVC) simulation model as shown in Figs. 2 and 3. The extensive simulation is done by using Open Street Map to generate the street network. The number of vehicles is set as 100, 200, 300, 400, 500, 600, 700, 800, 900, and 1000 and the communication range between vehicles and RSUs is set to be a radius of 400 m for every experiment. The experiments are carried out for 900 s while collecting the paths opted by vehicles during their journeys from the source to the destination.

Fig. 2. Scheduled event at RSU.

Fig. 3. Moving vehicles and hit event in SUMO.

The paths opted by the vehicles during their tours from the source to the destination are defined over a given street network for different numbers of vehicles. Motion arrangements of vehicles are generated physically so as to check the reliability of the proposed data gathering scheme. For example, if it is identified that a vehicle has to travel across the sequence of path segmentsduring its tour,then we would like to ensure the information regarding this sequence of path segments being sent and received accurately and appropriately. Once the proposed data gathering scheme is ensured accurately enough for more crowded situations then remaining experiments are focused on the paths that are generated arbitrarily. Simulation parameters used in all the experiments are mentioned in Table 3. Definitions of metrics used and their associated results with the defined metrics along with the justification of these results will be described in the following subsections.

5.1 Communication Overhead

The total number of messages destroyed before arriving at the destination in the execution of one simulation run is termedas communication overhead. Communication overhead is evaluated for the proposed data gathering scheme and the algorithms described in [9] to [11]. It is evaluated by considering all the vehicles and their numbers of messages sent in the given list for experiments under study. Communication overhead is calculated by considering the number of vehicles in the given street network. The parameters for simulating these experiments are given in Table 3. The number of different moving vehicles in the given street networks is used to calculate communication overhead for this set of executed experiments. These executed experiments worked over the time period of 500 s and vehicles count is set to raise from 100 to 1000 by taking a raise step of 100 vehicles after each run of simulation. The number of messages originated from moving vehicles to RSUs and from RSUs to moving vehicles in the given street networks is included in the calculation of communication overhead. Fig. 4 shows the communication overhead gained by the execution of the proposed scheme and the algorithms described in [9] to [11] in the above mentioned street networks corresponding to the different numbers of moving vehicles. The results after simulations confirm that the communication overhead of the proposed data gathering scheme is less than those corresponding to the algorithms described in [9] to [11].

Table 3: General simulation parameters

Fig. 4. Communication overhead with different moving vehicles.

5.2 Packet Delivery Ratio

In this research work, a packet delivery ratio is defined as the ratio of the data packets received by the RSUs to the data packets sent by the vehicles. Fig. 5 shows the packet delivery ratio gained by the execution of the proposed scheme and the algorithms described in [9] to [11], in the above mentioned street networks corresponding to the different numbers of moving vehicles. The results after simulations confirm that the packet delivery ratio of the proposed data gathering scheme is higher as compared with those described in [9] to [11].

Fig. 5. Packet delivery ratio with different moving vehicles.

5.3 Throughput

In this research work, the throughput is defined as the number of data packets moved successfully from the vehicle to RSU in a particular time slot. Fig. 6 shows the throughput attained by the execution of the proposed scheme and the algorithms described in [9] to [11], which confirms that the throughput of the proposed data gathering scheme is larger as compared with those described in [9] to [11].

Fig. 6. Throughput with different moving vehicles.

5.4 Latency

In this research work, latency is defined as the average delay of data packets from vehicles to RSUs. Fig. 7 shows the latency gained by the execution of the proposed scheme and the algorithms described in [9] to [11]. The results demonstrate that the latency of the proposed data gathering scheme is less as compared with those described in [9] to [11].

Fig. 7. Latency with different moving vehicles.

6. Determining the Common and Most Frequent Path

To determine the common and most frequent paths opted by the vehicles during their journeys from a local source to a local destination, data mining is applied on the information of the vehicular paths collected from the aforementioned data gathering scheme depending on their logical behavioral arrangements. With respect to the definitions mentioned in Section 3, confidence and support are two prime factors which may influence data mining of logical behavioral arrangement.

6.1 Support

A collection of arrangements is better known as an arrangement set. An arrangement set which includeskarrangements is ak-arrangement set. The existence of an arrangement set is the count of involved transactions which comprise the arrangement set. It can also be acknowledged as the support count. Relative support is used as the second name for the arrangement set support mentioned in Section 3, on the other hand absolute support is linked to the occurrence frequency. MP is declared as a frequent arrangement set, if a predefined minimum support threshold is satisfied by the relative support of an arrangement set MP. Tables 4 to 10 depict an example for generating frequent motion arrangements with some specified minimum support, i.e., 2. In this example,the motion database contains the entry of every vehicle traveling in a particular geographical region illustrating its motion arrangements. To get MPk, a collection of candidatekarrangement sets is created by combining MPk–1with itself. The generated set of candidates is represented by CMPk. Initially, a set of frequent candidate 1-arrangement sets is determined by inspecting motion database for calculating the count of each arrangement and accumulating those arrangements in the candidate 1-arrangement set (CMP1). From CMP1a new motion arrangement is generated. The resultant set is represented by MP1. MP1is used to get MP2, the set of frequent 2-arrangement sets which is further used to get MP3, keeping on doing this until no more frequentk-arrangement sets are determined. A full search of motion database is necessary to find each MPk. A significant attribute referred as an apriori attribute is used to advance the effectiveness of level wise production of frequent arrangement sets. The apriori attribute is where all the nonempty subsets of frequent motion arrangements must also be frequent and they belong to a frequent arrangement set. This is utilized to reduce the search space.

Table 4: Motion database

Table 5: Generating candidate 1-arrangement sets (CMP1) after scanning motion database

Table 6: Generating MP1 by comparing minimum support count with candidate 1-arrangement sets support count

Table 7: Generating candidate 2-arrangement sets CMP2 by combining MP1 with itself

Table 8: Generating MP2 by comparing minimum support count with candidate 2-arrangement sets support count

Table 9: Generating candidate 3-arrangement sets CMP3 by combining MP2 with itself

Table 10: Generating MP3 by comparing minimum support count with candidate 3-arrangement sets support count

To mine logical behavioral arrangements, a user specified minimum support is required. In this study, a minimum support set for some limited paths opted by the vehicles during their journeys from a source to a destination has been characterized and their frequency rates have been recorded in the collected motion database. In this study, the minimum supports of 2%,4%, 6%, 8%, 10%, 12%, and 14% have been used. The frequent motion arrangements that have been collected by using the proposed data gathering scheme are compared with those collected by the schemes mentioned in [9] to [11] with respect to the aforementioned minimum support. From the results shown in Fig. 8, it can be observed that the proposed data gathering scheme provides more accurate and less frequent motion arrangements. Moreover, it has been noticed that with the increase in the minimum support, the frequency of motion arrangements existing as sub-arrangements in the motion database reduces. This is due to the reason that the number of motion arrangements having support more than the set minimum support value becomes slightly lower when the minimum support threshold is increased, making data more sensitive for extracting frequent arrangements. Hence, in order to identify the most frequent motion arrangements, the minimum support can be defined consequently.

Fig. 8. Number of frequent motion arrangements vs. threshold with the minimum support.

6.2 Confidence

In order to mine logical behavioral arrangements,confidence is another prime factor associated with generated motion rules. Different sets of ordered incidents that can take place based on the happening of certain order of incidents before it is defined by these rules. For all the different subarrangements that arise after extracting the frequent logical behavioral arrangement, motion rules must be produced. The probability of the occurrence of an incident depending on previously ordered incidents is determined by measuring the confidence of specific motion rules. As a descriptive example an intersection is considered as shown in Fig. 9. The rules that have been generated from the intersection shown in Fig. 9 are as follow:

The confidence level of each generated rule depends on the percentage of its occurrence. The confidence of the aforementioned generated rules for the proposed data gathering scheme and the schemes mentioned in [9] to [11] is compared in Fig.10. Results depict that the confidence of the aforementioned generated rules for the proposed data gathering scheme is much better. In Fig. 10, the confidence of Rule 4 for the proposed data gathering scheme is 43% which indicates that 43% of the arrangements have confirmed that vehicles are opting the S4path segment (turn right) after passing through the path segments S1and S2. Depending upon the confidence of rules, one can make a good prediction. For example, in an emergency situation, confidence can be used to inform the ambulance that either to go straight along the road or choose the other best path depending upon the time of day or rush hours, which will reduce the time delay for the ambulance in emergency.

Fig. 9. Segment intersection scenario.

7. Conclusions

Fig. 10. Confidence of generated associated rules.

In this paper, logical behavioral arrangements of vehicles were captured to analyze the information about the paths traversed by the vehicles. For this, data was collected by using the proposed secure authentication based data gathering scheme. The proposed scheme was compared with the schemes mentioned in [9] to [11] and the results show that the proposed scheme outperformed others. Afterwards, the frequent mining approach was used to fetch the common and frequent paths opted by the vehicles during their journeys for path predictions.Association based rules were generated by applying constraints from selected arrangements. Minimum confidence and support were required to accept arrangements for further decision making. The proposed scheme may help the novice to avoid delay by providing a smart way of getting the best path map at a particular period of day and in case of unusual situations such as accident, morning rush hours, and patient having critical condition in ambulance.