A novel approach for feature extraction from a gamma‑ray energy spectrum based on image descriptor transferring for radionuclide identification

2023-01-05

Nuclear Science and Techniques, 2022, Issue 12

Hao‑Lin Liu · Hai‑Bo Ji · Jiang‑Mei Zhang · Cao‑Lin Zhang · Jing Lu · Xing‑Hua Feng

Abstract This study proposes a novel feature extraction approach for radionuclide identification to increase the precision of identification of the gamma-ray energy spectrum set. For easier utilization of the information contained in the spectra, the vectors of the gamma-ray energy spectra from Euclidean space, which are fingerprints of the different types of radionuclides, were mapped to matrices in the Banach space. Subsequently, to make the spectra in matrix form easier to apply to image-based deep learning frameworks, the matrices of the gamma-ray energy spectra were mapped to images in the RGB color space. A deep convolutional neural network (DCNN) model was constructed and trained on the ImageNet dataset. The mapped gamma-ray energy spectrum images were applied as inputs to the DCNN model, and the corresponding outputs of the convolution layers and fully connected layers were transferred as descriptors of the images to construct a new classification model for radionuclide identification. The transferred image descriptors consist of global and local features, where the activation vectors of fully connected layers are global features, and activations from convolution layers are local features. A series of comparative experiments between the transferred image descriptors, peak information, features extracted by the histogram of the oriented gradients (HOG), and scale-invariant feature transform (SIFT) using both synthetic and measured data were applied to 11 classical classifiers. The results demonstrate that although the gamma-ray energy spectrum images are completely unfamiliar to the DCNN model and have not been used in the pre-training process, the transferred image descriptors achieved good classification results. The global features have strong semantic information, which achieves an average accuracy of 92.76% and 94.86% on the synthetic dataset and measured dataset, respectively.
The results of the statistical comparison of features demonstrate that the proposed approach outperforms the peak-searching-based method, HOG, and SIFT on the synthetic and measured datasets.

Keywords Radionuclide identification · Feature extraction · Transfer learning · Gamma energy spectrum analysis · Image descriptor

1 Introduction

Nuclear science and technology are developing rapidly and have been applied in various fields, playing increasingly important roles in scientific research and production [1-3]. At the same time, the threat of nuclear weapons and radioactive contamination from nuclear industrial accidents have long-term and significant consequences for the environment, ecology, and biological health [4-6]. The detection and identification of radionuclides are crucial tasks under such circumstances [1, 7]. It is therefore important to develop algorithms for the detection and identification of radionuclides with stronger discrimination and higher accuracy. One of the most critical steps in radionuclide identification is feature extraction from gamma-ray energy spectra, which is a complicated task owing to the background conditions, energy resolution of the radiation detector, calibration shift, characteristic peak overlap, source strength, and shielding status [8, 9].

The feature extraction methods of traditional radionuclide identification algorithms can be summarized as searching for characteristic energy peaks in the gamma-ray energy spectra and matching them with peaks in a radionuclide library [2, 4, 8-11]; these methods are usually based on physical rules and do not require a training process. Another classical approach is template matching, in which a template library of gamma-ray energy spectra is established in advance and the entire spectrum, or a transformation of it, is matched against the templates in the library [12, 13]. These methods are highly operator-dependent, and their limitations are magnified when characteristic peaks are overwhelmed by background noise or when several interfering peaks are extracted.

With the development of artificial intelligence, radionuclide identification has gradually become a widely studied classification problem [14-16]. The main idea of classification methods applied to radionuclide identification is to extract features from gamma-ray energy spectra of known types and train classification models using the extracted features; the trained model is then applied to estimate the probability of radionuclides being present in spectra of unknown types [17]. Numerous feature extraction algorithms have been used for radionuclide identification, such as the Karhunen-Loeve transform (K-L transform) [18], principal component analysis (PCA) [13, 19], singular value decomposition (SVD) [20], wavelets [21, 22], the discrete cosine transform (DCT) [23], and sparse representation [24]. Subsequently, classification methods such as Bayesian classifiers [23, 25], extreme gradient boosting trees [26], back-propagation neural networks [17], artificial neural networks (ANN) [27, 28], fuzzy logic [29], long short-term memory (LSTM) [30], convolutional neural networks (CNN) [31], and deep convolutional neural networks (DCNN) [32, 33] were applied for radionuclide identification. The key factor for the success of these methods is the extraction of strongly discriminative features. The limitation of the aforementioned features is that only the relative relationship between the counts at the preceding and following energy addresses is considered, and errors may increase owing to the non-smoothness of low-count spectra [34-36].

Recent studies have shown that image descriptors transferred from CNNs and DCNNs provide stable and reliable performance for image classification problems [37-44]. Hu et al. [37] reported that features transferred from CNNs generalized sufficiently well to high-resolution remote sensing image datasets and were more expressive than low-level and mid-level features. Babenko et al. [38] experimentally determined that the activations of the top layers of CNNs are competitive even when the network was trained for an unrelated classification task such as ImageNet. Moreover, Gong et al. [39] transferred the outputs of the last fully connected layer of a DCNN as an image descriptor, and Razavian et al. demonstrated that features transferred from convolution layers can provide useful global descriptors of specific image regions [40-44]. Liu et al. [44] demonstrated that convolution layers have excellent generalization and efficiency and that transferring convolution layer features can achieve advanced performance.

This study constructs a novel method for extracting features from gamma-ray energy spectra for radionuclide identification. First, the gamma-ray energy spectra are transformed from vector form to matrix form and then to image form. Feature transfer is then performed using a DCNN model. The transferred image descriptors consist of the activations of the convolution layers and the activation vectors of the fully connected layers. To verify the effectiveness of the proposed method, 11 classical classification methods were employed to perform a statistical comparison, and the results demonstrate that the proposed method significantly outperforms the peak-searching-based method, the histogram of oriented gradients (HOG), and the scale-invariant feature transform (SIFT).

The following two main contributions are presented in this study:

• A novel preprocessing method for gamma-ray energy spectra is proposed. The vectors of the gamma-ray energy spectra are mapped to matrix form and then to image form. This conversion can improve the utilization of spectral information and serves as the basis for extracting essential features and constructing a more discriminative classifier.

• The application of image descriptors transferred from a DCNN model is explored and verified in the field of radionuclide identification. Experimental results demonstrate that image descriptors can effectively extract the essential features of the gamma-ray energy spectrum images. Local image descriptors transferred from higher convolution layers provide more discriminative descriptors, and global image descriptors transferred from the first fully connected layer have the strongest semantic information among the fully connected layers.

The remainder of this study is organized as follows. In Sect. 2, we introduce the proposed feature extraction approach for radionuclide identification. Section 3 presents a series of experiments using both synthetic datasets and measured datasets from real laboratory environments and offers a comparative analysis. Section 4 concludes the study.

2 Method

The proposed method consists of the following two major steps: (a) Mapping the vectors of the gamma-ray energy spectra from a Euclidean space to matrices in a Banach space, and then mapping the matrices to images in the RGB color space. (b) Constructing a DCNN model trained on ImageNet and transferring the corresponding activation vectors of fully connected layers and activations of convolution layers as global and local features of the gamma-ray energy spectrum images. Fig. 1 presents the procedure of the proposed method.

2.1 Data mapping

In this subsection, the essential features of the different radionuclides are extracted from a novel perspective.

Gamma rays are the products of the de-excitation of excited atomic nuclei and manifest as short-wavelength electromagnetic radiation. In essence, gamma rays are a stream of particles, namely gamma photons. Gamma photons are uncharged particles, and their interaction with matter is a random event. Collecting gamma photons directly with a general signal acquisition device is complex, whereas collecting electrical signals is relatively easy. A nuclear radiation detector therefore converts gamma photons into electrical signals for signal processing; the amplitude of the electrical signal is proportional to the photon energy. The principle of this process is that photons emitted by the radioactive source interact with the atoms of the medium in the detector to produce charged particles. The detector collects the particles, converts them into electrical signals, and detects nuclear signals by measuring the electrical signals. The distribution of counts over particle energy is obtained by calibrating the pulse amplitude against energy; the energy spectrum is the curve of this distribution. As the fingerprint of a radionuclide, the energy spectrum contains the information that distinguishes different radionuclides [2-4, 6].

For nuclear events, the count and time of events are random within a certain time interval. In radiation detection, the number of nuclear events measured over a certain period (e.g., detector counts) is also random. Because radioactive decay is a random process, each observation can be considered a random experiment, and the count per unit time can be regarded as a random variable that obeys a Poisson or Gaussian distribution. For a spherical space of a certain radius with a single point source at its center, the generation of gamma photons by radioactive decay is random and continuous, and photons are emitted uniformly in all directions.

The gamma-ray energy spectrum $s$ is a wide-sense stationary random vector in Euclidean space, $s = \{s_k\} \in H^l$. For convenience in the subsequent expressions, let $k = 0, 1, \dots, l-1$, where $s_k \in \mathbb{N}$ is the $(k+1)$-th count of photons distributed over the energy values. As previously indicated, the photons generated by radioactive decay are emitted uniformly in all directions in a spherical space of a certain radius with a single point source at its center. Therefore, $s_k$ ideally correlates positively with the duration of the measurement, as formulated by Eq. (1).

where $t$ is the duration of the measurement and $\alpha_k$ is a parameter for the $(k+1)$-th count value that is affected by the background noise of the environment, the measuring angle and distance, and the intensity of the radiation source.

Fig. 1 (Color online) Block diagram of the proposed method. Mapping the vectors of the gamma-ray energy spectra from a Euclidean space to matrices in a Banach space, then to images in an RGB color space. Constructing a DCNN model trained on ImageNet and transferring the corresponding activation vectors of the fully connected layers and activations of the convolution layers as global and local features of the gamma-ray energy spectrum images, which are subsequently used for classification

If $s$ is considered as a Markov chain or Markov process, that is, a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event, $s_k$ is related only to $s_{k-1}$. Vector-based feature extraction methods consider only part of the information in $s$, namely the relative relationship between the counts at the preceding and following energy addresses. This is not conducive to mining the mutual relationships between non-adjacent $s_k$, which makes it difficult to extract discriminative and effective features.

Every radioactive decay produces photons of different energy values, and the photon counts at a given energy obey a Poisson or Gaussian distribution. Thus, the total count of photons generated by radioactive decay per unit time also obeys a Poisson or Gaussian distribution. Let $a$ and $b$ be gamma-ray energy spectra obtained with different measurement durations in a measurement scenario with fixed background noise, using a specified detector to measure a specified radioactive source at a fixed location and orientation. As previously indicated, $a$ and $b$ are vectors from the same $H$, with $a \in H^l$ and $b \in H^l$. The corresponding measurement durations are $t_a$ and $t_b$, respectively.

To efficiently transfer discriminative information for identification from the gamma-ray energy spectra, the spectra are mapped from vector form in $H$ to matrix form in a Banach space $B$. The mapping $f$ is formulated by Eq. (4).

where $k = 0, 1, \dots, l-1$.

From the matrix perspective, there are more elements adjacent to $p_{ij}$. When extracting features from gamma-ray energy spectra, not only the relative relationship between the counts at the preceding and following energy addresses is considered, but also the relative relationships with the counts above, below, and on the diagonals. These relationships are easy to exploit and are more conducive to mining the internal and mutual relationships between the elements of $P$.
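
The vector-to-matrix idea can be sketched as follows. Because Eq. (4) is not reproduced in this text, the sketch assumes a row-major layout, $p_{ij} = s_{in+j}$ with $l = m \times n$; the function names are hypothetical and chosen only for illustration.

```python
def vector_to_matrix(s, m, n):
    """Reshape a length-(m*n) spectrum into an m-by-n matrix.

    Row-major ordering is an assumption; the paper's Eq. (4) may differ.
    """
    if len(s) != m * n:
        raise ValueError("spectrum length must equal m * n")
    return [list(s[i * n:(i + 1) * n]) for i in range(m)]


def neighbours(P, i, j):
    """Return the values of the (up to eight) elements adjacent to P[i][j]."""
    m, n = len(P), len(P[0])
    return [
        P[a][b]
        for a in range(max(0, i - 1), min(m, i + 2))
        for b in range(max(0, j - 1), min(n, j + 2))
        if (a, b) != (i, j)
    ]


s = list(range(16))              # toy "spectrum" with l = 16 channels
P = vector_to_matrix(s, 4, 4)
print(P[1][2])                   # channel k = 1 * 4 + 2
print(sorted(neighbours(P, 1, 2)))
```

In the vector form, a channel is adjacent only to its predecessor and successor; in the matrix form, the same channel also touches channels one row above and below, which is the additional neighbourhood the paragraph above refers to.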

To apply $P$ as the input of a DCNN model and transfer image descriptors as features of the gamma-ray energy spectra, it is essential to map $P$ to image form. The mapping $g$, which maps $P$ from matrix form in $B$ to image form in the RGB color space $J$, is formulated by Eq. (6).

Equation (6) maps the element values of $Q$ onto the corresponding pixels of an image with specified colors. Each $q_{ij}$ corresponds to a rectangular area in the image, and the values of $q_{ij}$ are indices into the Parula color map [45] that determine the color of each patch. Equation (6) maps the smallest value in $Q$ to the first entry of the Parula color map and the largest value in $Q$ to the last entry. All intermediate values of $Q$ are linearly scaled onto the Parula color map in ascending order. The relationship between the values of the elements in $Q$ and the colors of the corresponding pixels is shown in Fig. 2.
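
The linear scaling described here can be sketched as below. The 64-entry size is taken from the caption of Fig. 2; the floor rounding and the function name are assumptions, and the actual Parula RGB triplets are not reproduced (only the row index into the color map is computed).

```python
def colormap_indices(P, n_colors=64):
    """Map each matrix value to a row index of an n_colors-entry color map.

    The smallest value maps to index 0, the largest to n_colors - 1, and
    intermediate values are scaled linearly (floor rounding is an assumption).
    """
    flat = [v for row in P for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Constant matrix: every element gets the first color.
        return [[0 for _ in row] for row in P]
    return [[int((v - lo) * (n_colors - 1) / (hi - lo)) for v in row] for row in P]


P = [[0, 10, 20], [30, 40, 63]]
Q = colormap_indices(P)
print(Q)
```

Each resulting index would then select one RGB triplet from the color map to paint the corresponding rectangular patch of the image.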

Obviously,

Fig. 2 (Color online) Relationship between the elements and the colormap. The Parula colormap is a three-column array with 64 rows, where each row defines one color using an RGB triplet, i.e., contains the red, green, and blue intensities for a specific color. The intensities are in the range [0, 1], where a value of 0 indicates no color and a value of 1 indicates full intensity

Based on the aforementioned analysis, the effects of different measurement durations on $Q_a$ and $Q_b$ can be ignored under the same measurement conditions. More explicitly, in an ideal situation the mappings of gamma-ray energy spectra from the same radionuclide into the RGB color space would yield nearly identical images, which reduces the intraclass differences caused by different measurement durations. Even in a real measurement situation, mapping the gamma-ray energy spectra from a Euclidean space to a Banach space and then to an RGB color space remains applicable and can reduce intraclass differences, thereby helping to extract the essential features from the gamma-ray energy spectra of different radionuclides.
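
The reasoning behind this claim can be made explicit with a short derivation. Because Eqs. (1)-(8) are not reproduced in this text, the forms below are assumptions chosen to match the surrounding definitions: counts proportional to the measurement duration, and a min-max scaling applied before the color lookup.

```latex
% Assumed ideal model: a_k = \alpha_k t_a and b_k = \alpha_k t_b,
% hence a = c\,b with c = t_a / t_b > 0.
% The mapping f only rearranges elements, so P_a = c\,P_b.
\[
\frac{p^{a}_{ij} - \min P_a}{\max P_a - \min P_a}
= \frac{c\,p^{b}_{ij} - c\,\min P_b}{c\,\max P_b - c\,\min P_b}
= \frac{p^{b}_{ij} - \min P_b}{\max P_b - \min P_b}
\]
% The normalized values, and therefore the color indices, coincide,
% which gives Q_a = Q_b in the ideal (noise-free) case.
```

With counting noise the equality holds only approximately, which is consistent with the "nearly identical images" wording above.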

2.2 Feature transferring

In this subsection, a DCNN model was constructed and trained on the ImageNet dataset. The mapped gamma-ray energy spectrum images were applied as inputs to the DCNN model, and the corresponding activation vectors of fully connected layers and activations from convolution layers were transferred as descriptors of images to construct a new classification model for radionuclide identification.

VGG is a widely used convolutional neural network (CNN) model proposed by Karen Simonyan and Andrew Zisserman at the University of Oxford [46]. VGG has various configurations, for instance VGG-11, VGG-16, and VGG-19. Of all the configurations, VGG-16 was identified as the best-performing model on the ImageNet dataset. The basic building block of VGG is a stack of multiple (usually one, two, or three) convolution layers with a filter size of 3 × 3, a stride of 1, and a padding of 1, followed by a max-pooling layer of size 2 × 2. Different configurations of this stack are repeated in the network to achieve various depths. The number associated with each configuration is the number of layers with weight parameters. The convolution stacks are followed by three fully connected layers, two with a size of 4096 and the last one with a size of 1000. The last layer is the output layer with a Softmax activation; the size of 1000 corresponds to the total number of possible classes in ImageNet. In the method proposed in this study, the VGG structure consists of $U$ convolution layers, $V$ max-pooling layers, and $N$ fully connected layers. The convolution layers use convolution kernels to convolve the input image, and the results of the convolution constitute the feature maps of the input image; thus, the local features of the image are extracted by the convolution layers. The max-pooling layers are arranged after the convolution layers and compute the maximum value within each pooling window of the feature maps. This reduces the dimension of the extracted feature information, making the feature maps smaller, reducing the computational complexity of the network, and helping to avoid overfitting. The $U$ convolution layers can be divided into $M$ groups using the max-pooling layers as separators. The fully connected layers map the distributed feature representation learned by the convolution and max-pooling layers to the sample label space. In essence, they compute a weighted sum of the features to integrate the local features and output them as a value, which reduces the influence of the position of local features on the classification.
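
The building-block description can be made concrete with a small configuration sketch. The channel counts below are those of the standard VGG-16 configuration from [46]; the list encoding and the helper name are illustrative assumptions, not the authors' code.

```python
# Standard VGG-16 layout: numbers are the output channels of 3x3 convolutions,
# "M" marks the 2x2 max-pooling layer that closes a group.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]
FC_SIZES = [4096, 4096, 1000]    # three fully connected layers


def summarize(cfg, fc_sizes):
    """Count convolution layers (U), pooling layers (V), groups (M), and weight layers."""
    groups, current = [], []
    for item in cfg:
        if item == "M":
            groups.append(current)
            current = []
        else:
            current.append(item)
    n_conv = sum(len(g) for g in groups)
    return {
        "conv_layers": n_conv,
        "pool_layers": len(groups),
        "groups": len(groups),
        "weight_layers": n_conv + len(fc_sizes),
    }


print(summarize(VGG16_CFG, FC_SIZES))
```

The "16" in VGG-16 counts only layers with weights, that is, the convolution layers plus the fully connected layers; pooling layers carry no weights and are not counted.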

The convolution layers and max-pooling layers have parameters such as the height and width of the convolution kernel, the number of input channels, the number of output channels (number of convolution kernels), the padding, and the stride. The padding parameter refers to padding the boundary of the original matrix with zeros before the convolution; the convolution kernel can thus extend to the pseudo-pixels beyond the edge when scanning the input image, which avoids the loss of edge information. The stride parameter refers to the step length of each movement of the convolution kernel, and the stride size affects the efficiency of the model. Strictly speaking, each convolution kernel also has a bias parameter; to simplify the calculation, the bias is omitted in the following.
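
The effect of kernel size, padding, and stride on the feature-map size follows the standard output-size relation. It is given here as a sketch because the paper's own size equations are not reproduced in this text and may use different notation; the 224 × 224 input size is the usual VGG input and is an assumption about the setup.

```python
def output_size(size, kernel, padding, stride):
    """Standard relation: floor((size + 2 * padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


h = 224                                   # assumed input height (= width)
sizes = []
for _ in range(5):                        # five groups of convolution layers
    h = output_size(h, kernel=3, padding=1, stride=1)   # 3x3 conv keeps the size
    h = output_size(h, kernel=2, padding=0, stride=2)   # 2x2 max-pooling halves it
    sizes.append(h)
print(sizes)   # [112, 56, 28, 14, 7]
```

With a 3 × 3 kernel, a padding of 1, and a stride of 1, the convolution leaves the spatial size unchanged, so only the pooling layers shrink the feature maps.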

VGG is trained on the ImageNet dataset to obtain the classification model PM. ImageNet is an image database organized according to the WordNet hierarchy, in which hundreds or thousands of images depict each node of the hierarchy. The dataset has been instrumental in advancing computer vision and deep learning research [47]. Let $S$ be a set of gamma-ray energy spectra in Euclidean space, $S \subseteq H^l$, and let $s \in S$ be a gamma-ray energy spectrum. $s$ is converted into $Q$ using Eqs. (4) and (6), and $Q$ is applied as an input to PM to transfer features. Generally, PM has only one final output; here, the activation vectors of the fully connected layers and the activations of the convolution layers produced during the forward pass are transferred as the image descriptors of $Q$.

Each convolution kernel of PM corresponds to a receptive field, and a small part of the image (the local receptive area) is used as the input of the lowest convolution layer, so that each neuron output by the convolution layer perceives only a local image area and does not need to perceive the whole image. This operation is equivalent to passing the data through a digital filter to obtain the most salient features of the observed data. In the fully connected layers, the different local features from the convolution layers are combined through the weight matrix to form a representation of the global information. Therefore, the transferred activation vectors of the fully connected layers are regarded as global feature representations of the gamma-ray energy spectrum images, and the transferred activation maps from the convolution layers are regarded as local features describing particular regions of the gamma-ray energy spectrum images.
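
The "local receptive field as a digital filter" idea can be illustrated with a minimal sketch. This is plain Python written for illustration, not the PM model: the kernel is a hand-picked edge detector rather than a learned one, padding and bias are omitted, and the operation is the cross-correlation that deep learning frameworks conventionally call convolution.

```python
def conv2d_valid(image, kernel):
    """Slide a small kernel over the image (no padding, stride 1, no bias)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool_2x2(fmap):
    """Keep the maximum of each non-overlapping 2x2 block."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


# Toy 6x6 image with a vertical edge between columns 2 and 3.
image = [[0, 0, 0, 9, 9, 9] for _ in range(6)]
kernel = [[-1, 0, 1] for _ in range(3)]     # hand-picked vertical-edge filter

feature_map = conv2d_valid(image, kernel)   # 4x4 map, large only near the edge
pooled = max_pool_2x2(feature_map)          # 2x2 map after pooling
print(feature_map[0], pooled)
```

Each output value depends only on a 3 × 3 patch of the input, so the feature map responds where the local pattern (here, an edge) occurs; a fully connected layer applied afterwards would mix all positions into a single global vector.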

Specifically, the activation maps transferred from the convolution layers, that is, the set of feature maps transferred from the $i$-th group of convolution layers of PM, are denoted as $c_i$, $i \in \{1, 2, \dots, M\}$, which is essentially a set of square matrices. $c_i$ contains $d(i)$ feature maps, where $d(i)$ depends on the number of convolution kernels in the convolution layers of the $i$-th group. The size of each feature map is $h_o \times w_o$, where $h_o$ and $w_o$ can be calculated using Eqs. (9) and (10). $c_i$ can be written as:

2.3 Illustration

In this subsection, we present a simple example of the proposed method. For illustration, we selected two synthetic gamma-ray energy spectra, $s_1$ and $s_2$, of $^{60}$Co. In essence, $s_1$ and $s_2$ are two vectors in the Euclidean space $H^l$, where $l = 4096$. The simulation settings of $s_1$ and $s_2$ are essentially the same; the only difference is the total number of simulated particles, which is equivalent to a difference in the measurement duration. The number of simulated particles correlates with the duration of the measurement because the generation of gamma photons by radioactive decay is random and continuous, and photons are emitted uniformly in all directions in space. The numbers of simulated particles corresponding to $s_1$ and $s_2$ are denoted $t_1$ and $t_2$, where $t_1$ is five hundred thousand and $t_2$ is two hundred and fifty thousand. To illustrate the difference between the two spectra, $s_1$ and $s_2$ are plotted in Fig. 3, where the blue curve is $s_1$ and the red curve is $s_2$. Figure 3 clearly shows the differences between $s_1$ and $s_2$ caused by the measurement duration: the counts over the energy addresses of the entire spectrum are significantly higher in $s_1$ than in $s_2$.

To extract more discriminative image descriptors from the gamma-ray energy spectra for identification, $s_1$ and $s_2$ are mapped to matrix form using Eq. (4). The mapped matrices are denoted $P_1$ and $P_2$, which lie in the Banach space $B^{m \times n}$, where $m = n = 64$. To reduce the discrepancy between $P_1$ and $P_2$ caused by the difference in the measurement duration, and to allow the matrices to be used as inputs of the DCNN for the transfer of image descriptors, $P_1$ and $P_2$ are mapped to image form using Eq. (6), and the results are denoted $Q_1$ and $Q_2$. The gamma-ray energy spectra of $^{137}$Cs and $^{152}$Eu were randomly selected for comparison, and their corresponding images $Q_{\mathrm{Cs}}$ and $Q_{\mathrm{Eu}}$ were obtained using Eqs. (4) and (6). The intraclass similarity and interclass distinction of the different gamma-ray energy spectra are shown in Fig. 4. Specifically, $Q_1$ and $Q_2$ appear as nearly identical images in Fig. 4a and b, which indicates that the difference between $s_1$ and $s_2$ seen in Fig. 3 is significantly reduced when $s_1$ and $s_2$ are mapped to the RGB color space. By contrast, Fig. 4c and d clearly have completely different characteristics from Fig. 4a and b. Namely, the mappings of the gamma-ray energy spectra of the randomly selected $^{137}$Cs and $^{152}$Eu in the RGB color space exhibit completely different characteristics from those of $^{60}$Co. From the comparison and analysis in Sect. 2, the effects of different measurement durations on $Q_1$ and $Q_2$ can be ignored under the same measurement conditions. Therefore, mapping the gamma-ray energy spectra from Euclidean space to Banach space and then to the RGB color space remains applicable and can reduce intraclass differences, thereby helping to extract the essential features from the gamma-ray energy spectra of different radionuclides. A further quantitative analysis is presented in Sect. 3.

Table 1 The overall flow of the proposed method

Fig. 3 (Color online) Two original spectra of $^{60}$Co. The simulation settings of $s_1$ and $s_2$ are essentially the same, and the only difference is the total number of simulated particles, $t_1$ and $t_2$, which is equivalent to the difference in the measurement duration

In the feature transfer phase, the structure of VGG and the corresponding feature sizes are listed in Table 2, which consists of five groups of convolution layers and two fully connected layers. The layer number only considers the number of convolution layers, and the last layer of each group is the max-pooling layer. The number of channels in the feature size is not changed in the convolution layers, but changes in the max-pooling layer owing to the size of the stride parameter [the number of output channels in different groups can be calculated by Eqs. (9) and (10)]. For fully connected layers, the number of output channels is equal to the number of neurons in the fully connected layers. The partial parameters of the convolution, max-pooling, and fully connected layers are shown in Table 3.

Fig. 4 (Color online) Gamma-ray energy spectrum images. a and b are the images of $Q_1$ and $Q_2$; c and d are the images of $Q_{\mathrm{Cs}}$ and $Q_{\mathrm{Eu}}$

Table 2 The structure of VGG

Table 3 Partial parameters of VGG

The VGG framework was trained on the ImageNet dataset to obtain the classification model PM. $Q_1$ and $Q_2$ are applied as input images to PM to transfer features, and the activation vectors of the fully connected layers and the activations of the convolution layers produced in the process are transferred as the image descriptors of $Q_1$ and $Q_2$. The image descriptors transferred from the fifth group of convolution layers, $c_5$, and from the first fully connected layer, $f_1$, were selected as illustrations. $c_5$ and $f_1$ from the synthetic and measured datasets are shown in a low-dimensional space obtained through t-SNE [48] in Fig. 7. Each color in the figure represents one type of radionuclide, which intuitively reflects the differences in expressive and discriminative ability between the features. The transferred image descriptors have strong intraclass similarity and interclass differentiation; a further analysis of the feature performance is presented in the next section.

3 Experiments and analysis

This section introduces the acquisition and preprocessing of the synthetic data and the measured data and establishes a series of experiments using 28 classification methods based on the Weka machine learning toolkit to verify the feasibility of image descriptors transferred from the gamma-ray energy spectrum for radionuclide identification. Based on the previous experiments, statistical comparisons of features through nonparametric and Friedman tests were conducted to verify whether image descriptors transferred from DCNNs can be used as an essential feature representation for gamma-ray energy spectrum images.

3.1 Data preparation

The production of the synthetic dataset consisted of the following three steps: (1) Background data acquisition. A self-made 3-inch NaI detector was used to measure the ambient background. The measurement was conducted in a laboratory environment with no separate radioactive source present. The detector was placed at a fixed position for 12 h. Two measurements were performed, one with lead bricks placed around the detector and the other without, yielding two sets of background data. (2) Single-nuclide energy spectrum acquisition. Based on the Geant4 platform, the transport of the gamma-ray particles of 26 single nuclides was simulated using the Monte Carlo method. We constructed simulated scenarios in Geant4, each containing only a single radioactive point source of a given type and a 3-inch NaI detector. The positions and relative distances of the radioactive point sources and the detector were fixed. A total of one million photons were simulated, and their trajectories were recorded by the detector and converted into a gamma-ray energy spectrum. (3) Data synthesis. To simulate real measurements as closely as possible, the two sets of background data and the 26 sets of synthetic spectra were linearly superimposed with a random signal-to-noise ratio, $\mathrm{SNR} = N_{nc} / N_{bg}$, where $N_{nc}$ is the sum of the photon counts emitted by the radioactive point source and $N_{bg}$ is the sum of the photon counts of the background. In the linear superposition process, the SNR value is a random number between 0.3 and 1. A total of 2080 synthetic gamma-ray energy spectra of 26 radionuclides were obtained and named dataset 1. Table 4 lists the common radionuclides used, a total of 26 radionuclides in the following four categories: SNM, industrial, medical, and NORM.
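
The data-synthesis step can be sketched as follows. The text defines the SNR as the ratio of total source counts to total background counts but does not spell out how the superposition is scaled, so this sketch assumes the simulated source spectrum is rescaled to hit the drawn SNR and then added to the background; all names and the toy spectra are hypothetical.

```python
import random


def mix_with_background(source, background, snr):
    """Scale `source` so that sum(source) / sum(background) == snr, then add the background."""
    n_src, n_bg = sum(source), sum(background)
    if n_src == 0 or n_bg == 0:
        raise ValueError("both spectra must contain counts")
    factor = snr * n_bg / n_src
    return [factor * s + b for s, b in zip(source, background)]


rng = random.Random(0)                      # fixed seed for reproducibility
source = [0, 5, 50, 5, 0, 0, 20, 0]         # toy single-nuclide spectrum
background = [10, 9, 8, 7, 6, 5, 4, 3]      # toy background spectrum
snr = rng.uniform(0.3, 1.0)                 # random SNR in the range stated in the text
mixed = mix_with_background(source, background, snr)

# Recover the realized SNR from the mixed spectrum as a sanity check.
source_part = sum(mixed) - sum(background)
realized_snr = source_part / sum(background)
print(abs(realized_snr - snr) < 1e-9)
```

Repeating the draw with different SNR values and both background sets would yield multiple synthetic spectra per nuclide, in the spirit of dataset 1.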

The measured gamma-ray energy spectra were obtained from radioactive sources in a laboratory environment with lead brick shielding. The measurements used a cadmium zinc telluride (CZT) gamma-ray spectrometer from Kromek. The spectrometer has 4096 measuring channels; its measurable energy range is 25 keV to 3.0 MeV, and its electronic noise is lower than 10 keV. Two category V radiation sources, $^{137}$Cs and $^{60}$Co, and one category IV radiation source, $^{152}$Eu, were used in the measurements. The spectrometer was carried on a Turtlebot robot to control the measuring distance quantitatively and to reduce the radiation exposure of the experimental operators. As shown in Table 5, seven groups of samples were established; the measured object, measuring distance, and measuring duration were varied in the process. A total of 150 measured gamma-ray energy spectra of three single radiation sources and a total of 200 measured gamma-ray energy spectra of four mixed radiation sources were obtained and named dataset 2. Figure 5 presents the spectrum of $^{60}$Co in different sample sets. Figure 6 presents the gamma-ray energy spectrum images in different sample sets.

Table 4 Radionuclide library of synthetic sample set

Table 5 Grouping of gamma-ray energy spectrum samples

3.2 Feature performance comparison and analysis

Fig. 5 Examples of synthetic and measured spectra. a displays a 60Co synthetic spectrum, and b displays a real measured 60Co spectrum

Fig. 6 (Color online) Examples of synthetic and measured gamma-ray energy spectrum images. Group a are synthetic gamma-ray energy spectrum images of 26 radionuclides, group b are measured gamma-ray energy spectrum images of 7 single and mixed radionuclides

Research has shown that image descriptors transferred from DCNNs can provide reliable performance for image classification problems. However, which features to choose for a specific classification domain remains an open question. For the gamma-ray energy spectrum classification addressed in this study, and considering the limits on computing resources and parameter scale, VGG-16 was chosen as the DCNN framework because it applies a very small 3 × 3 receptive field (filter) throughout the entire network with a stride of 1 pixel. A stack of multiple 3 × 3 filters with nonlinear activation layers can replace a single filter with a larger receptive field, which makes the decision function more discriminative with respect to the characteristics of the spectrum and enables the network to converge faster [46]. In addition, the consistent use of 3 × 3 convolutions across the network keeps the architecture simple and elegant and makes it convenient to transfer image descriptors.
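The claim that stacked 3 × 3 filters can replace a larger receptive area can be checked with a few lines of arithmetic. The sketch below computes the effective receptive field of stacked stride-1 convolutions and compares weight counts for an equal number of input and output channels; the helper names and the channel count of 64 are illustrative assumptions.

```python
def stacked_receptive_field(num_layers, kernel=3, stride=1):
    """Side length (in pixels) of the effective receptive field of stacked convolutions."""
    field, jump = 1, 1
    for _ in range(num_layers):
        field += (kernel - 1) * jump  # each layer widens the field by (kernel - 1) input steps
        jump *= stride
    return field

def conv_weights(num_layers, kernel, channels):
    """Weight count (biases ignored) of stacked kernel x kernel convolutions with equal in/out channels."""
    return num_layers * kernel * kernel * channels * channels

print(stacked_receptive_field(2))                      # 5
print(stacked_receptive_field(3))                      # 7
print(conv_weights(3, 3, 64), conv_weights(1, 7, 64))  # 110592 200704
```

Three stacked 3 × 3 layers therefore cover the same 7 × 7 area as a single 7 × 7 filter while using fewer weights and inserting two additional nonlinearities.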

In this subsection, 28 classification methods from the Weka machine learning toolkit [49] were applied in a series of experiments to verify whether image descriptors transferred from a DCNN model can be used to construct a classification model with strong discrimination and high accuracy. Because the number of transferred features is large (five sets of local features and two sets of global features for a single energy spectrum), a single set of features is sufficient in practice to build a satisfactory classifier. Therefore, the experiments in this subsection used single sets of local and global features as well as combinations of high-level features. Multiple groups of transferred image descriptors from the synthetic and measured datasets were used to train and test the 28 classification models. All classification models were trained with the default parameters and settings specified in the toolkit, and tenfold cross-validation was applied to avoid the imbalance caused by random data segmentation. The percentages of misclassified cases for the transferred features on the synthetic and measured datasets are listed in Tables 6 and 7. Bold entries in the tables mark the percentages of misclassified cases that performed well in each experiment.
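The evaluation protocol (tenfold cross-validation, reported as the percentage of misclassified cases) can be sketched without any toolkit. The sketch below is not the Weka procedure used in the experiments: the stratified fold assignment and the nearest-centroid stand-in classifier are assumptions chosen to keep the example self-contained, and all function names are hypothetical.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Assign every sample index to one of k folds, spreading each class evenly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[(offset + pos) % k].append(idx)
        offset += len(members)
    return folds

def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier: return the class whose mean feature vector is closest to x."""
    sums, counts = {}, defaultdict(int)
    for vec, label in zip(train_x, train_y):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [a + b for a, b in zip(sums[label], vec)]
        counts[label] += 1

    def squared_distance(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=squared_distance)

def misclassified_percent(features, labels, k=10, seed=0):
    """Percentage of misclassified samples under k-fold cross-validation."""
    errors = 0
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_x = [f for i, f in enumerate(features) if i not in held_out]
        train_y = [l for i, l in enumerate(labels) if i not in held_out]
        for i in test_idx:
            if nearest_centroid_predict(train_x, train_y, features[i]) != labels[i]:
                errors += 1
    return 100.0 * errors / len(labels)
```

Calling `misclassified_percent(descriptors, nuclide_labels)` on one group of descriptors yields a single table-style entry; repeating the call for each descriptor group and each classifier would fill a table of the same shape as Tables 6 and 7.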

The results demonstrate that although the gamma-ray energy spectrum images are completely unfamiliar to the DCNN model and have not been used in the pre-training process, the transferred image descriptors achieved good classification results.

For local image descriptors, a higher group of convolution layers provides more discriminative descriptors than a lower group. Some of the multi-group union image descriptors provide better discrimination, and the order in which descriptors from the same groups are arranged also affects the results. On the synthetic dataset, the best average classification accuracy of the local image descriptors, 93.08%, was obtained by the multi-group union image descriptor composed of c3, c4, and c5. On the measured dataset, the best average classification accuracy of the local image descriptors, 93.51%, was obtained by c5. The best classification accuracies of the local image descriptors on the synthetic and measured datasets were 100.00% and 99.71%, respectively.

For the global image descriptors, f1, which is transferred from the first fully connected layer, carries strong semantic information and achieves average accuracies of 92.76% and 94.86% on the synthetic and measured datasets, respectively. The best classification accuracies of the global image descriptors on the synthetic and measured datasets were both 100.00%. In contrast, the classification model trained on the global image descriptors transferred from the second fully connected layer performed poorly. These experiments provide preliminary evidence that image descriptors transferred from a DCNN model can serve as descriptors of radionuclide gamma-ray energy spectrum images for classification; however, further comparison with features of the gamma-ray energy spectra and with other image descriptors of the gamma-ray energy spectrum images is needed.

3.3 Statistical comparison

Characteristic peaks are the most important features in traditional radionuclide identification methods. Four groups of characteristic peaks were extracted from the gamma-ray energy spectra by varying the peak properties, that is, distance, prominence, width, and threshold. The scale-invariant feature transform (SIFT) [50] and the histogram of oriented gradients (HOG) [51] are two classical feature extraction algorithms in computer vision. Two groups of features were extracted from the gamma-ray energy spectrum images using the HOG and SIFT algorithms. Fig. 7 visualizes the various features of the synthetic and measured datasets in a low-dimensional space using t-SNE [48], including the characteristic peaks, the features extracted by HOG and SIFT, the local image descriptors c5, and the global image descriptors f1. As shown in Fig. 7, the distribution of the characteristic peaks is chaotic, and no apparent clustering center is observed for the different radionuclides. Although the HOG and SIFT features form apparent clustering centers for certain radionuclides, the features of these radionuclides overlap significantly. By contrast, the transferred local and global image descriptors are markedly more discriminative, and feature points of the same class aggregate more tightly.

The aforementioned features from one- and two-dimensional spectral images were applied to 21 classification methods using the Weka machine learning toolkit [49]. All experiments applied tenfold cross-validation to avoid the imbalance caused by random data segmentation. The proportions of misclassified samples in the experiments are listed in Tables 8 and 9. These results provide a basis for evaluating the performance of the transferred features. However, from a statistical point of view, they do not provide strong support for preferring one group of features. For a statistical comparison of the features, the nonparametric test method recommended by Demšar [52] was adopted for comparing multiple features across several classification methods. First, we tested whether there were significant differences among the seven groups of features; the Friedman test was applied for this comparison across the multiple classification methods.

Table 6 Percentage of misclassified cases for transferred image descriptors on synthetic data set

Assuming that Num classification methods and k groups of features are involved in the experiment, the Friedman test assigns a rank r_i^j to each group of features, which indicates the rank of feature j under the i-th classification method. The average rank R_j of feature j is computed using Eq. (13).
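The ranking step can be sketched as follows. The sketch assumes that, for each classification method, the feature group with the lowest misclassification percentage receives rank 1 and that tied groups share the average of the tied positions, which is the usual convention for the Friedman test; the function names and the table layout are hypothetical.

```python
def rank_row(errors):
    """Rank one classifier's error rates: lowest error gets rank 1, ties share the average rank."""
    order = sorted(range(len(errors)), key=lambda j: errors[j])
    ranks = [0.0] * len(errors)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and errors[order[end + 1]] == errors[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for t in range(pos, end + 1):
            ranks[order[t]] = shared
        pos = end + 1
    return ranks

def average_ranks(error_table):
    """error_table[i][j] is the error of feature group j under classification method i."""
    n = len(error_table)
    k = len(error_table[0])
    rows = [rank_row(row) for row in error_table]
    return [sum(r[j] for r in rows) / n for j in range(k)]
```

Applied to a table with one row per classification method and one column per feature group, `average_ranks` returns one value R_j per feature group.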

The null hypothesis is rejected if χ²_F exceeds the critical value, which indicates a statistically significant difference between the features. Conversely, the null hypothesis cannot be rejected if χ²_F does not exceed the critical value. When the null hypothesis is rejected, a post hoc test is applied to determine the nature of the differences and to compare the relative performance of the different features.
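The equations referred to as Eqs. (13) and (14) are not reproduced in this text. For orientation, the standard forms of the average rank and the Friedman statistic given by Demšar [52] are shown below, written with Num classification methods and k groups of features; that these coincide with the authors' equations is an assumption.

```latex
R_j = \frac{1}{\mathrm{Num}} \sum_{i=1}^{\mathrm{Num}} r_i^j ,
\qquad
\chi_F^2 = \frac{12\,\mathrm{Num}}{k(k+1)}
  \left[ \sum_{j=1}^{k} R_j^2 - \frac{k(k+1)^2}{4} \right]
```

Under the null hypothesis all feature groups are equivalent, so every R_j is close to (k+1)/2 and the bracketed term is close to zero.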

Fig. 7 (Color online) Visualization of various features by t-SNE. Each color in the figure represents one type of radionuclide, which intuitively reflects the difference in the expressive ability and discrimination ability between different features

Table 8 Percentage of misclassified cases for various features on synthetic data set

Table 9 Percentage of misclassified cases for various features on measured data set

In our experiments, the significance level α was set to 0.05. For dataset 1, the corresponding critical value was calculated as 2.2541, whereas χ²_F = 44.8442; for dataset 2, the corresponding critical value was calculated as 2.2541, whereas χ²_F = 2.6218. The Friedman test therefore rejected the null hypothesis for both datasets, and further Holm tests [52] were required to compare the performance of the different features.
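The decision rule can be sketched numerically. The functions below implement the standard Friedman statistic and its Iman-Davenport correction as described by Demšar [52]; which of the two forms the reported values correspond to is not stated in the text, so both are shown, and the function names are hypothetical. The reported critical value 2.2541 appears consistent with an F distribution with (6, 60) degrees of freedom at α = 0.05, which would correspond to the corrected form with k = 7 and Num = 11; this reading is an inference, not a statement from the source.

```python
def friedman_chi2(avg_ranks, n_classifiers):
    """Friedman statistic computed from the average ranks of k feature groups."""
    k = len(avg_ranks)
    spread = sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4.0
    return 12.0 * n_classifiers / (k * (k + 1)) * spread

def iman_davenport(chi2, n_classifiers, k):
    """F-distributed correction with (k - 1) and (k - 1)(n_classifiers - 1) degrees of freedom."""
    return (n_classifiers - 1) * chi2 / (n_classifiers * (k - 1) - chi2)

def reject_null(statistic, critical_value):
    """The null hypothesis of equal performance is rejected when the statistic exceeds the critical value."""
    return statistic > critical_value

# Values reported in the text for dataset 1 and dataset 2.
print(reject_null(44.8442, 2.2541))  # True
print(reject_null(2.6218, 2.2541))   # True
```

The margin for dataset 2 is small, so that result is more sensitive to the choice of statistic and critical value than the result for dataset 1.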

The test statistic of the Holm test for comparing features in a pairwise manner is given in Eq. (15).
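Eq. (15) is not reproduced in this text. The standard pairwise statistic given by Demšar [52] for two feature groups with average ranks R_i and R_j, k groups of features, and Num classification methods is shown below; that it matches the authors' equation is an assumption.

```latex
z = \frac{R_i - R_j}{\sqrt{\dfrac{k(k+1)}{6\,\mathrm{Num}}}}
```

The denominator is the standard error of the difference between two average ranks, so z can be referred to the standard normal distribution.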

Using the z value obtained from Eq. (15), the corresponding probability p can be determined from the normal distribution table and compared with the appropriate α to determine whether the hypothesis is rejected, that is, whether the proposed features are better than a given group of features. Let p_1, p_2, …, p_{k−1} denote the ordered probability values, so that p_1 ≤ p_2 ≤ … ≤ p_{k−1}. The step-down procedure of the Holm test begins with the most significant p value and compares each p_i with the adjusted α, which is calculated as α/(k − i).

If the adjusted α for the first comparison, α/(k − 1), is higher than p_1, the corresponding null hypothesis, which states that both features have the same performance, is rejected, and the procedure proceeds to compare p_2 with the corresponding adjusted α, α/(k − 2). If the second null hypothesis is also rejected, the procedure continues with the third null hypothesis, and so on. Once a null hypothesis cannot be rejected, all remaining null hypotheses are retained as well.
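The step-down procedure can be sketched in a few lines. The sketch assumes the standard z statistic from Demšar [52], a two-sided p value from the normal distribution, and a comparison of one control feature group against all others; the two-sided choice, the function name, and the tuple layout of the result are assumptions.

```python
from statistics import NormalDist

def holm_step_down(avg_ranks, control, n_classifiers, alpha=0.05):
    """Compare a control feature group with every other group using Holm's procedure.

    Returns tuples (feature_index, z, p, adjusted_alpha, rejected), ordered by p.
    """
    k = len(avg_ranks)
    std_err = (k * (k + 1) / (6.0 * n_classifiers)) ** 0.5
    rows = []
    for j, rank in enumerate(avg_ranks):
        if j == control:
            continue
        z = (rank - avg_ranks[control]) / std_err
        p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))  # two-sided p value (assumption)
        rows.append((j, z, p))
    rows.sort(key=lambda row: row[2])  # most significant comparison first

    results, still_rejecting = [], True
    for i, (j, z, p) in enumerate(rows, start=1):
        adjusted = alpha / (k - i)
        # After the first hypothesis that cannot be rejected, all later ones are retained.
        rejected = still_rejecting and p < adjusted
        still_rejecting = rejected
        results.append((j, z, p, adjusted, rejected))
    return results
```

With k = 7 groups of features there are six comparisons, and the adjusted α runs from 0.05/6 for the smallest p value to 0.05/1 for the largest.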

For dataset 1, the Holm procedure rejects the first, second, third, fourth, and fifth hypotheses because the corresponding p values are smaller than the adjusted α, whereas the final hypothesis cannot be rejected. This indicates that the transferred image descriptors perform significantly better than the characteristic peaks searched by prominence, width, and threshold and the features extracted by HOG and SIFT at the significance level α = 0.05, but not significantly better than the characteristic peaks searched by distance. For dataset 2, the Holm procedure rejects all the hypotheses, which indicates that the transferred image descriptors perform significantly better than the characteristic peaks and the features extracted by HOG and SIFT at the significance level α = 0.05. Searching for characteristic peaks by distance defines a minimum horizontal distance between adjacent peaks and removes peaks that are too close together until all remaining peaks satisfy the distance condition. This peak-finding method is effective for the synthetic gamma-ray energy spectra based on the NaI detector but produces a larger classification error on the gamma-ray energy spectra measured with the CZT detector. Because the resolution of the NaI detector is limited and the synthetic energy spectra come from an idealized simulated environment, the simulated energy spectra have fewer interference peaks, whereas the measured gamma-ray energy spectra fluctuate significantly.
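The distance-based peak search can be illustrated with a minimal sketch. The text states only that peaks closer than the minimum horizontal distance are removed until the condition holds; the sketch additionally assumes that taller peaks take priority over their shorter neighbours, and the function names and toy spectrum are hypothetical.

```python
def local_maxima(counts):
    """Indices of channels that are strictly higher than both neighbours."""
    return [i for i in range(1, len(counts) - 1)
            if counts[i - 1] < counts[i] > counts[i + 1]]

def filter_by_distance(counts, peaks, min_distance):
    """Keep peaks that are at least min_distance channels apart, preferring taller peaks."""
    keep = set(peaks)
    for p in sorted(peaks, key=lambda i: counts[i], reverse=True):
        if p not in keep:
            continue  # already removed by a taller neighbour
        for q in peaks:
            if q != p and abs(q - p) < min_distance:
                keep.discard(q)
    return sorted(keep)

# Toy 10-channel spectrum, for illustration only.
counts = [0, 3, 1, 9, 2, 4, 0, 0, 7, 0]
peaks = local_maxima(counts)
print(peaks)                                 # [1, 3, 5, 8]
print(filter_by_distance(counts, peaks, 3))  # [3, 8]
```

On a strongly fluctuating spectrum the candidate list contains many spurious maxima, so the outcome depends heavily on the chosen minimum distance, which is in line with the larger error reported for the measured CZT spectra.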

The results of the aforementioned comparative experiments show that image descriptors transferred from DCNNs are better than the characteristic peaks extracted from the gamma-ray energy spectra and the HOG and SIFT shallow features extracted from the gamma-ray energy spectrum images. These transferred image descriptors are essential and discriminative, and they can provide reliable performance for gamma-ray energy spectrum image classification problems.

4 Conclusion

This study proposes a novel feature extraction approach for radionuclide identification to facilitate the extraction of structural and essential features and increase the precision of identification on the gamma-ray energy spectrum set.

The results of a series of comparative experiments between the proposed method, the peak-searching-based method, HOG, and SIFT using both synthetic and measured data lead to the following conclusions. (1) The information preprocessing of the proposed method, that is, mapping the gamma-ray energy spectra from a Euclidean space to a Banach space and then to an RGB color space, is significant for extracting the essential features and can reduce the intraclass differences caused by different measuring durations. (2) The feature transfer process of the proposed method, that is, transferring the activation vectors of the fully connected layers and the activations of the convolution layers from DCNNs as image descriptors, can effectively extract the essential features of gamma-ray energy spectrum images. (3) Local image descriptors transferred from higher convolution layers are more discriminative. (4) The global image descriptors transferred from the first fully connected layer carry the strongest semantic information among the fully connected layers. (5) The proposed method outperforms the peak-searching-based method, HOG, and SIFT on the synthetic and measured datasets.

Future studies will focus on exploring the value of other DCNNs in the field of radionuclide identification, exploring more feature fusion methods and aggregation approaches to develop more powerful descriptors, and establishing more universal datasets to further advance research on radionuclide identification.

Table 10 Results of statistical comparison

Author contributions All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Hao-Lin Liu, Cao-Lin Zhang and Xing-Hua Feng. The first draft of the manuscript was written by Hao-Lin Liu and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
