
Fault diagnosis method of train control RBC system based on KPCA-SOM network

2020-04-28 LI Yangqing, LIN Haixiang

LI Yang-qing, LIN Hai-xiang

(1. School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; 2. Rail Transit Electrical Automation Engineering Laboratory of Gansu Province, Lanzhou Jiaotong University, Lanzhou 730070, China)

Abstract: The radio block center (RBC) system is the core equipment of the China train control system-3 (CTCS-3). At present, fault analysis of the RBC system depends mainly on manual work, which is inaccurate and inefficient. Therefore, an intelligent fault diagnosis method for the RBC system based on the one-hot model, kernel principal component analysis (KPCA) and the self-organizing map (SOM) network is proposed. Firstly, the fault document matrix based on the one-hot model is constructed from the manually selected fault feature lexicon and the fault tracking record table. Secondly, KPCA is used to reduce the dimension and noise of the fault document matrix to avoid information redundancy. Finally, the processed data are input into the SOM network to train the KPCA-SOM fault classification model. Compared with the back propagation (BP) neural network and the plain SOM network, the KPCA-SOM intelligent diagnosis model effectively distinguishes the common fault patterns of the train control RBC system, with further improved accuracy and processing efficiency.

Key words: radio block center (RBC) system; fault diagnosis; self-organizing map (SOM); kernel principal component analysis (KPCA)

0 Introduction

The radio block center (RBC) is the core ground equipment of the China train control system-3 (CTCS-3) and an important guarantee for the fast and safe operation of high-speed railways. According to the RBC fault tracking records of Guiyang Railway Station from January 2016 to December 2017, there were 8 RBC accidents per month, which greatly restricted the operating speed of the high-speed railway. Therefore, realizing intelligent fault diagnosis of the train control RBC system is of great significance for train safety research.

At present, equipment fault diagnosis of the RBC system relies mainly on manual experience and data monitoring systems. As mentioned in Refs.[1-3], the fault scope and cause can be identified by analyzing the RBC driving log manually, but maintenance personnel need to master the vehicle-ground information transmission process and the meaning of the messages, so the accuracy and efficiency of diagnosis are limited by individual professional proficiency. A data monitoring system requires technicians to analyze a large amount of monitoring data, which is difficult and inefficient[4]. Although control theory and artificial intelligence methods such as Bayesian networks[5-6], expert systems[7] and neural networks[8-9] have achieved remarkable results in railway system fault diagnosis, there are few methods for intelligent fault diagnosis of the train control RBC system. At present, only one intelligent fault diagnosis method for RBC, based on case-based reasoning (CBR), has been proposed by Guo et al.[3] and Zhang[10], but it has not been widely used because of its huge knowledge base and slow case search speed[11].

In view of the shortcomings of CBR, this paper uses the self-organizing map (SOM) network, which is suited to small-sample training, to construct the classifier[12]. As an unsupervised neural network, SOM preserves the original topological structure of the sample vectors without requiring the type of the input vectors to be specified, and it has good self-organization, self-adaptation and robustness[13]. Moreover, unlike traditional neural networks, it can achieve good classification performance when trained on a small number of samples[14-16]. The data source selected in this paper is the fault tracking record table of the train control RBC system. Although the records are effective, they are limited by field conditions and the technical knowledge of personnel, so their integrity and comprehensiveness cannot be guaranteed, and it is difficult to obtain a large amount of complete fault information. The small-sample capability of the SOM network therefore suits a fault diagnosis classifier built on the train control RBC fault tracking record table. To improve fault diagnosis efficiency, the kernel principal component analysis (KPCA)-SOM model is proposed for fault diagnosis of the RBC system.

1 Research object and relevant theories

1.1 Research object

RBC is the core equipment of the high-speed railway train control system. According to the signal authorization received from the interlock system[12,17] and the location reports sent by the trains, the RBC system generates the operation authorization for each train under its jurisdiction and sends it to the train to ensure safe operation. To facilitate maintenance, the RBC system is equipped with a local terminal, a maintenance terminal, a juridical recording unit (JRU) and other equipment. The system structure is shown in Fig.1.

Fig.1 Schematic diagram of train control RBC system

Notes: ISDN: integrated services digital network; CTC: centralized traffic control system; CSM: centralized monitoring system; GSM-R: global system for mobile communications-railway; JRU: juridical recording unit

1.2 KPCA

The main idea of KPCA is to transform the samples nonlinearly. It realizes non-linear principal component analysis in the original space by performing linear principal component analysis on the samples in a high-dimensional feature space, so as to represent the original data set with the fewest features and thus reduce the data dimension. The input data matrix $X_{n\times m}$ is mapped to the high-dimensional feature space $H=\{G(X)\}$ by the non-linear mapping $G$, where $x_t$ is the $t$-th sample of the input data matrix. The covariance matrix of the high-dimensional feature space is

$$U = \frac{1}{n}\sum_{t=1}^{n} G(x_t)\,G(x_t)^{\mathrm{T}}. \tag{1}$$

The key to KPCA is to find the mapping direction that captures the variance of the original data matrix to the greatest extent, which amounts to solving the eigenvalue problem

$$\zeta R = UR, \tag{2}$$

where $\zeta$ is the eigenvalue and $R$ is the corresponding eigenvector, i.e. the mapping direction that represents the variance of the original data matrix to the greatest extent.

Thus, the mapping formula of the data in the original data sample (x1,x2,…,xn) is

(3)
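The KPCA procedure described in this section can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: it assumes a Gaussian kernel of the form $\exp(-\lVert x-y\rVert^2/(2\sigma^2))$, solves the eigenvalue problem of Eq.(2) through the centered kernel matrix instead of forming the feature-space covariance explicitly, and uses a random binary matrix as toy data. The names `rbf_kernel` and `kpca` and all parameter values are hypothetical.

```python
import numpy as np

def rbf_kernel(X, sigma):
    """Gaussian kernel matrix, assuming K(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))

def kpca(X, sigma, n_components):
    """Project the rows of X onto the leading kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, sigma)
    # Center the kernel matrix, i.e. center the mapped samples in feature space.
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    # eigh returns eigenvalues in ascending order; reverse to put the largest first.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    eigvals = eigvals[::-1][:n_components]
    eigvecs = eigvecs[:, ::-1][:, :n_components]
    # Scale the coefficient vectors so each feature-space direction has unit norm.
    alphas = eigvecs / np.sqrt(eigvals)
    return Kc @ alphas, eigvals

# Toy data (assumption): 12 binary samples with 8 features each.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(12, 8)).astype(float)
Z, eigvals = kpca(X, sigma=2.0, n_components=3)
print(Z.shape)
```

Working with the $n\times n$ kernel matrix avoids ever computing $G(x_t)$ explicitly, which is what makes the method practical when the feature space is very high-dimensional.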

1.3 SOM network

The SOM network is a competitive neural network composed of fully connected neurons, characterized by unsupervised self-learning. Its two layers are the input layer and the competition layer (i.e. the output layer). The number of input layer neurons of the classical SOM network is $n$, and the number of competition layer neurons is $s\times d$, which can be arranged as a two-dimensional planar array. The network structure of the classical SOM network is shown in Fig.2.

The learning process of SOM network is as follows:

1) The input vectors $X$ (i.e. the input data $x_1,x_2,\ldots,x_n$) in the input layer and the corresponding weight vectors $W_i$ of the individual neurons in the competition layer are normalized so that the modulus of $X$ and $W_i$ is 1.

Fig.2 SOM network structure

2) The weight vectors of all neurons in the competition layer are compared with the input vector presented to the network. The neuron whose weight vector has the highest similarity is the winning neuron. The similarity depends on the Euclidean distance between the input vector and the neuron: the smaller the Euclidean distance, the higher the similarity. The Euclidean distance $d_{ij}$ between the $i$-th input vector of the mapping layer and the $j$-th neuron is calculated by

$$d_{ij} = \lVert x_i - w_{ij} \rVert, \tag{4}$$

where $x_i$ is the $i$-th input vector; $w_{ij}$ is the weight between the $i$-th input vector and the $j$-th neuron.

3) Adjust the connection weights between the winning neurons and the adjacent neurons. The adjustment formula is

$$\Delta w_{ij} = \alpha D_{kj}(x_i - w_{ij}), \tag{5}$$

where $\alpha$ is the learning rate, $D_{kj}$ is the neighborhood function and $k$ is the competitive winning neuron of the current input vector $x_i$.

The expression of the neighborhood function is

(6)

where $h_k$ is the position of the winning neuron; $h_j$ is the position of the $j$-th neuron; $\delta^2$ is the variance, which decreases gradually as learning progresses.

4) Judge whether the learning is terminated or not. For all the input vectors in the training process, if the corresponding winning neurons do not change, that is to say, the network converges, then the learning is terminated. Otherwise return to step 2) to continue learning.

The learning process of SOM network shows that the weights and thresholds of the winning neurons and other neurons in their neighborhoods are adjusted, which makes the SOM network have good learning and generalization ability.
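The four learning steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the neighborhood function is assumed to be Gaussian, $\exp(-\lVert h_k-h_j\rVert^2/(2\delta^2))$, the learning rate and neighborhood width are assumed to decay linearly, and the termination check of step 4) is replaced by a fixed number of passes over the data. The function names, the decay schedules and the toy data are all hypothetical.

```python
import numpy as np

def train_som(X, rows, cols, n_steps, lr0=0.5, seed=0):
    """Train a rows x cols SOM for a fixed number of passes over X."""
    rng = np.random.default_rng(seed)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # step 1: unit-norm inputs
    W = rng.random((rows * cols, X.shape[1]))
    W /= np.linalg.norm(W, axis=1, keepdims=True)      # step 1: unit-norm initial weights
    # Grid position h_j of every competition-layer neuron.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    delta0 = max(rows, cols) / 2.0
    for step in range(n_steps):
        frac = step / n_steps
        lr = lr0 * (1.0 - frac)                        # assumed linear decay
        delta = max(delta0 * (1.0 - frac), 0.5)        # neighborhood width shrinks
        for x in X[rng.permutation(len(X))]:
            # Step 2: the winner is the neuron with the smallest Euclidean distance.
            k = np.argmin(np.linalg.norm(W - x, axis=1))
            # Assumed Gaussian neighborhood centered on the winner.
            D = np.exp(-np.sum((grid - grid[k]) ** 2, axis=1) / (2.0 * delta ** 2))
            # Step 3: move the winner and its neighbors toward the input, as in Eq.(5).
            W += lr * D[:, None] * (x - W)
    return W

def winners(X, W):
    """Index of the winning neuron for every row of X."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    return [int(np.argmin(np.linalg.norm(W - x, axis=1))) for x in X]

# Toy data (assumption): 11 binary vectors; the small offset avoids all-zero rows.
rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(11, 20)).astype(float) + 0.01
W = train_som(X, rows=5, cols=5, n_steps=150)
wins = winners(X, W)
print(wins)
```

To follow step 4) literally, one could compare the list returned by `winners` after each pass with that of the previous pass and stop once it no longer changes.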

2 Fault diagnosis model of RBC based on KPCA-SOM network

In this paper, fault information is mined by manually selecting fault feature lexicon combined with fault tracking record table of train control RBC system. Firstly, the fault document matrix taken as initial data sample is established by using one-hot model. Then, based on KPCA, dimension reduction and noise reduction of data samples are carried out to avoid redundancy of fault attributes. Finally, the data sets are randomly divided into training data and testing data, which are input into SOM network successively, and the fault diagnosis model of train control RBC system based on KPCA-SOM network is established. The model building block diagram is shown in Fig.3.

Fig.3 Fault diagnosis model of train control RBC system based on KPCA-SOM network

2.1 Fault pattern table

The fault tracking record table of the train control RBC system is recorded in natural language. Table 1 is a partial example of the fault tracking table, taken from the RBC fault tracking record table of Guiyang Railway Station from January 2016 to December 2017. Because this paper is concerned only with the fault phenomena and the corresponding fault patterns in the record table, the irrelevant items of the original fault tracking record table are omitted from Table 1.

Table 1 Fault tracking record table of train control RBC

By analyzing the fault tracking record table of RBC of Guiyang Railway Station, the common fault patterns are summarized as Table 2.

Table 2 Common fault patterns table for RBC system

2.2 Fault feature lexicon

When the one-hot model is used to represent the fault tracking records of RBC, a standard RBC fault feature lexicon must be established. Because RBC fault feature words are not covered by general Chinese lexicons, expert knowledge is needed to build the lexicon. In theory, every term in a fault record can be used to represent fault information, but information such as train number, time and location has no practical significance for determining the fault pattern. Therefore, these terms are deleted when the feature lexicon is established. Finally, a total of 86 feature terms were selected, such as {local terminal, unlimited timeout, mobile authorization, JRU, level conversion, downgrade, CTCS-2, front car, rear car, ..., emergency brake}.

2.3 Fault document matrix and construction of fault data sample base

After the fault feature lexicon is selected, the one-hot model is used to represent the fault records. For text representation, the one-hot model first extracts the non-repetitive feature words from the original text dataset to form a vocabulary containing $V$ feature words. Each fault record is then represented by a $V$-dimensional vector. When the value of the $m$-th dimension ($m=1,2,\ldots,V$) is 1, the $m$-th feature word in the vocabulary appears in the fault record; when the value is 0, it does not appear. The one-hot model assumes that the feature words in the fault record table are independent of each other, that is, changing the order of the feature words in a fault record does not affect the fault diagnosis. The resulting fault document matrix is shown in Table 3, where $n$ is the fault record number and $w$ is the fault feature word.

Table 3 Fault document matrix
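The construction of the fault document matrix can be illustrated with a short sketch. The lexicon excerpt reuses a few terms listed in Section 2.2, but the two fault records are invented for illustration, and matching terms by simple substring search is an assumption; real records written in Chinese would first need word segmentation.

```python
def build_document_matrix(records, lexicon):
    """Return one binary row per fault record and one column per lexicon term."""
    matrix = []
    for text in records:
        lowered = text.lower()
        # 1 if the feature term occurs anywhere in the record, otherwise 0.
        matrix.append([1 if term.lower() in lowered else 0 for term in lexicon])
    return matrix

# Hypothetical lexicon excerpt and fault records, for illustration only.
lexicon = ["local terminal", "JRU", "level conversion", "emergency brake"]
records = [
    "Local terminal shows a timeout and the train applies emergency brake",
    "JRU stops recording after level conversion",
]
doc_matrix = build_document_matrix(records, lexicon)
for row in doc_matrix:
    print(row)
# prints [1, 0, 0, 1] and then [0, 1, 1, 0]
```

Each row depends only on which terms occur, not on their order, which reflects the independence assumption of the one-hot model.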

Because the feature words of the selected feature lexicon may contain redundant information, these redundant feature attributes will increase the network complexity and slow down training. Therefore, in order to construct the standard fault data sample library, the dimension of the fault document matrix needs to be reduced by the KPCA method.

KPCA realizes the nonlinear projection from the input space to the high-dimensional feature space through inner products computed by the kernel function. Commonly used kernel functions include the sigmoid kernel function and the Gaussian kernel function. Because the Gaussian radial basis function (RBF) kernel is simple to compute and gives good classification performance, the RBF kernel function is selected and its expression is

(7)

where $\sigma$ is the width parameter of the function, and its value has a great influence on the performance of KPCA. Therefore, when KPCA is used to reduce the feature dimension, the kernel width parameter must be optimized to improve the separability of the feature data. The optimization process of $\sigma$ is as follows:

2) The intra-class and inter-class distances of the $k$-class kernel principal components are respectively defined as

(8)

(9)

3) The smaller the intra-class distance and the larger the inter-class distance of the feature data in different classes, the better the separability of the feature data. The optimization function of $\sigma$ is

(10)

The value of $\sigma$ at which $H$ reaches its maximum is taken as the optimal parameter.
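The idea of choosing $\sigma$ by class separability can be sketched as a simple grid search. The exact forms of Eqs.(8)-(10) are not reproduced here, so this sketch makes its own assumptions: the intra-class distance is taken as the mean squared distance of the projected samples to their class mean, the inter-class distance as the mean squared distance of the class means to the overall mean, and the criterion as their ratio. The Gaussian kernel form, the candidate values, the function names and the toy data are likewise hypothetical.

```python
import numpy as np

def kpca_rbf(X, sigma, n_components):
    """KPCA projections of the training samples with an assumed Gaussian kernel."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n              # centering matrix
    vals, vecs = np.linalg.eigh(J @ K @ J)
    vals, vecs = vals[::-1][:n_components], vecs[:, ::-1][:, :n_components]
    return vecs * np.sqrt(np.maximum(vals, 0.0))

def separability(Z, labels):
    """Assumed criterion: inter-class scatter divided by intra-class scatter."""
    classes = np.unique(labels)
    means = np.array([Z[labels == c].mean(axis=0) for c in classes])
    intra = np.mean([np.mean(np.sum((Z[labels == c] - m) ** 2, axis=1))
                     for c, m in zip(classes, means)])
    inter = np.mean(np.sum((means - Z.mean(axis=0)) ** 2, axis=1))
    return inter / intra

def select_sigma(X, labels, candidates, n_components=2):
    """Return the candidate width with the largest separability score."""
    scores = [separability(kpca_rbf(X, s, n_components), labels) for s in candidates]
    return candidates[int(np.argmax(scores))], scores

# Toy data (assumption): two well-separated classes of 10 samples each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(10, 4)), rng.normal(2.0, 0.3, size=(10, 4))])
labels = np.array([0] * 10 + [1] * 10)
candidates = [0.5, 1.0, 2.0, 5.0]
best_sigma, scores = select_sigma(X, labels, candidates)
print(best_sigma)
```

A grid search is only one possible way to locate the maximum; any one-dimensional optimizer over $\sigma$ could be substituted.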

Compared with the fault document matrix, the feature terms removed from the fault data sample library after dimensionality reduction are $D$={front car, rear car, demarcation point, initialization, on-line, redundancy, safety data network, cabinet, system, train control center, log file, indicator light, logic, on-board equipment, communication, switching unit and on-board vehicle, software, brake}. In the end, 65 feature terms remain in the fault data sample library, compared with 86 in the fault document matrix.

3 Simulation experiment and result analysis

3.1 Network structure design and simulation experiment

The SOM network is constructed according to the sample database of fault data. The 65 feature attributes retained after reduction are used as the input vectors of the network. The competition layer is set to 5×5=25 neurons, and the training step is set to $t$=[10,50,100,150,200,300,500].

One set of data is selected from each fault pattern in the sample database to form a standard fault sample matrix $A_{11\times 65}$ (i.e. 11 kinds of standard fault samples), which is used to train the SOM network. The classification results of the SOM network for different values of $t$ are shown in Table 4. The numbers in Table 4 are the serial numbers of the neurons, and C1, C2,…,C11 are the codes of the fault patterns listed in Table 2.

Table 4 Classification effect of different training steps t

Table 4 shows that when $t$=10, fault patterns C5, C6 and C9 are classified into one class, C7 and C8 into one class, and C10 and C11 into one class. At this point, the SOM network has a preliminary classification effect on the standard fault samples. When $t$=50, the network can further distinguish C10 and C11. When $t$=100, C7 and C8 are also divided into different categories. When $t$=150, the network can completely distinguish the 11 fault patterns. The accuracy of network fault classification thus improves as the number of training steps increases. When the training steps increase further (e.g. to 200, 300 and 500), the standard fault samples are still classified into different classes, but compared with $t$=150 the larger $t$ only sacrifices training efficiency, so the optimal value of $t$ is 150.

It can be seen from Table 4 that when the training step is 150, the serial numbers of the winning neurons of fault patterns C1 to C11 are $V$={2, 5, 15, 11, 8, 9, 17, 19, 10, 21, 23}. In Fig.4, there are 25 hexagons and each hexagon represents a neuron. The neurons are numbered from 1 to 25, from bottom to top and from left to right. For example, the neuron corresponding to category C1 is at position 2, that corresponding to category C2 is at position 5, that corresponding to category C5 is at position 8, and so on. It can be seen that the network distinguishes the fault patterns corresponding to the 11 kinds of standard fault samples.

Fig.4 Competitive winning neurons when t=150

3.2 Comparison and analysis of simulation results

In this paper, convergence steps, absolute error and accuracy are selected to evaluate the classification performance of the network comprehensively. To demonstrate that the KPCA-SOM network has better classification performance in the case of small samples, it is compared with the back propagation (BP) neural network[9] and the ordinary SOM network. The comparison results obtained from repeated experiments are shown in Table 5.

By analyzing Table 5, it can be seen that:

1) Compared with the BP neural network model, the absolute error of the ordinary SOM network model is reduced by 3.18% and the accuracy is increased by 2.59%.

Table 5 Comparison of simulation results

2) When the competition layer neurons adopt the structure of 5×5, the convergence step of KPCA-SOM network is reduced, the average absolute error is reduced by 2.78%, the training time is reduced by 3.87 s, and the accuracy is improved by 4.65%, compared with the ordinary SOM network. It shows that dimensionality reduction of fault documents by KPCA is helpful to improve the training efficiency and diagnostic accuracy of SOM network.

3) With the KPCA-SOM model, the absolute error and accuracy are similar when the structure of the competition layer neurons is 4×4 and 5×5, respectively. However, with the 5×5 structure the network converges faster, the training time is reduced by 1.3 s and the training efficiency is higher. This indicates that a proper increase in the number of neurons can improve the training efficiency of the network.

4 Conclusions

1) Because the manually selected fault feature lexicon contains redundant information, KPCA is used to extract the main features of each fault pattern, which improves the training efficiency and diagnostic accuracy of the network.

2) The KPCA-SOM network model algorithm is realized through the simulation experiment of actual fault data. The experimental results show that the fault diagnosis method has a good ability of automatic fault identification in train control RBC fault diagnosis, which shows that the method can be realized in engineering and has certain engineering application value.

3) By comparing KPCA-SOM model with ordinary SOM model and BP model, it can be seen that KPCA-SOM model has better classification performance than BP neural network and ordinary SOM model when the data samples are small.

4) Based on the fault tracking record table of the train control RBC system, an intelligent fault diagnosis method based on the one-hot model, dimensionality reduction by KPCA and the self-organizing map (SOM) network is proposed. Taking the fault records of Guiyang Railway Station from January 2016 to December 2017 as data samples, the feasibility and validity of the proposed KPCA-SOM fault diagnosis model are verified, which provides ideas for further research on optimizing intelligent fault diagnosis methods for the train control RBC system.