
Distinguish Fritillaria cirrhosa and non-Fritillaria cirrhosa using laser-induced breakdown spectroscopy

2021-08-05

Plasma Science and Technology, 2021, Issue 8

Kai WEI (魏凯), Xutai CUI (崔旭泰), Geer TENG (腾格尔), Mohammad Nouman KHAN and Qianqian WANG (王茜蒨)

1 School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, People’s Republic of China

2 Key Laboratory of Photonic Information Technology, Ministry of Industry and Information Technology, Beijing Institute of Technology, Beijing 100081, People’s Republic of China

Abstract As traditional Chinese medicines, Fritillaria from different origins are very similar and difficult to distinguish. In this study, laser-induced breakdown spectroscopy combined with learning vector quantization (LIBS-LVQ) was proposed to distinguish powdered samples of Fritillaria cirrhosa and non-Fritillaria cirrhosa. We also studied the performance of linear discriminant analysis and support vector machine on the same data set. Among these three classifiers, LVQ had the highest correct classification rate, 99.17%. The experimental results demonstrated that the LIBS-LVQ model could be used to differentiate powdered samples of Fritillaria cirrhosa and non-Fritillaria cirrhosa.

Keywords: laser-induced breakdown spectroscopy (LIBS), learning vector quantization, chemometric models, robustness of model

1. Introduction

Fritillaria is a botanical medicine of great medicinal value, used to moisten the lungs, relieve cough, reduce swelling, and remove phlegm [1]. The therapeutic effects of Fritillaria from different origins differ [2]. Fritillaria cirrhosa, which originates in Sichuan, is the most prized Fritillaria and is often used in clinical applications. Its price is the highest among all types of Fritillaria. Some illegal merchants pass off non-Fritillaria cirrhosa as Fritillaria cirrhosa in the market. The fruits of Fritillaria cirrhosa and non-Fritillaria cirrhosa can be identified using morphological identification methods. However, when Fritillaria fruits are ground into powders for use in medicine, they cannot be identified using morphological methods [3, 4].

Currently, methods such as random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), DNA barcoding, and expressed sequence tags (ESTs) are commonly used to identify powdered samples of Fritillaria [5]. However, these methods have their limitations. With RAPD, the identification results of different primers are not comparable, so the technique is difficult to standardize. The AFLP technique requires high-purity DNA, which makes it unsuitable for large-scale analysis and identification [6]. It is difficult to identify related species using the DNA barcode technology [5]. Because it requires reverse transcriptase and cloning technology, the EST technique is extremely difficult to operate [6]. Moreover, these techniques are usually performed under laboratory conditions. To develop a fast, in situ method for use in the field, we proposed to use laser-induced breakdown spectroscopy (LIBS) to identify powdered samples of Fritillaria cirrhosa and non-Fritillaria cirrhosa.

LIBS has unique advantages such as high speed, in situ operation, micro-destructiveness, remote sensing capability, and simultaneous multi-element analysis [7], and it has been successfully applied to metals [8–10], plastics [11], glass [12], fingerprints [13], rocks [14, 15], plant tissue [16, 17], biological tissue [18, 19] and so on.

In the field of traditional Chinese medicine (TCM), researchers have also applied LIBS extensively. Dong et al [20] analyzed ten elements, namely Mg, Al, Si, P, Ca, Ti, Mn, Fe, Co, and C, in Oriental Water Plantain Rhizome using LIBS. Liu et al [21] extracted the feature lines of LIBS spectra of four types of Tibetan medicines, namely Renqing Mangjue, Renqing Changjue, 25-herb coral pills, and 25-herb pearl pills. For heavy metals in TCM, Li et al [22] detected Pb in Coptis chinensis using LIBS and determined the optimum experimental parameters. Wang et al [23] detected Cu in Coptis chinensis, aconite root, and Poria cocos using LIBS. These studies mainly focused on analyzing the elemental information of TCM. However, to our knowledge, few studies have been performed to classify powdered samples of Fritillaria cirrhosa and non-Fritillaria cirrhosa using LIBS technology.

In this study, LIBS combined with learning vector quantization (LIBS-LVQ) was proposed to distinguish powdered samples of Fritillaria cirrhosa and non-Fritillaria cirrhosa. The powders of Fritillaria thunbergii and Fritillaria pallidiflora, which originate in Zhejiang and Xinjiang, respectively, were selected as samples of non-Fritillaria cirrhosa. As far as we know, LVQ has not previously been used in LIBS data analysis, and this is the first time LIBS combined with LVQ has been used in the classification of TCM. For comparison, we evaluated the proposed method against two commonly used classifiers, linear discriminant analysis (LDA) and support vector machine (SVM). The correct classification rate (CCR) was used as an indicator to evaluate the performance of the classifiers.

2. Learning vector quantization

The LVQ network, built on a competitive network structure, is a supervised self-organizing neural network [24, 25]. During network learning, supervised signals are added as classification information to fine-tune the weights, and the output neurons are pre-specified. The LVQ neural network effectively combines competitive learning and supervised learning, which can achieve good results in classification problems [26].

The structure of the LVQ network is shown in figure 1. It consists of three layers of neurons, namely the input layer, the hidden layer (competition layer), and the output layer [27]. The input and hidden layers are fully connected, while the hidden and output layers are partially connected. Each hidden layer neuron is connected to only one output layer neuron, with the connection weight fixed at 1, and each output layer neuron is connected to multiple hidden layer neurons.

When a vector is input, the weights of the winning neuron are fine-tuned. Through repeated competitive learning, the weights of the hidden layer neurons are gradually adjusted toward the cluster centers of the input sample space. When a hidden layer neuron is activated, its output state is 1, whereas the other hidden layer neurons have an output state of 0. Therefore, the state of the output layer neuron connected to the activated hidden layer neuron is 1, and the state of the remaining output layer neurons is 0. The output layer neurons ($y_1, y_2, \cdots, y_n$) correspond to different types, thus achieving pattern recognition.

The steps of the LVQ network learning algorithm are as follows:

Step 1. Inputting the sample vector

The vector $x = [x_1, \cdots, x_m]^T$ is input to the input layer.

Step 2. Network initialization

The learning rate η (η > 0) and the maximum number of iterations are set. The weights $w_{ij}$ between the input and hidden layers are initialized to the midpoint of the input vectors.

Step 3. Looking for the winning neuron

The distance between the input vector and the weight vector of each hidden layer neuron is calculated as follows:

$$d_j = \sqrt{\sum_{i=1}^{m} \left(x_i - w_{ij}\right)^2}$$

where $w_{ij}$ represents the weight between the $i$th input layer neuron and the $j$th hidden layer neuron. The hidden layer neuron with the smallest distance is selected as the winning neuron, which is denoted as $h_j^*$.

Step 4. Updating connection weights

The weights of the winning neuron are adjusted according to one of two rules. When the network classification result is consistent with the expected classification result, the weight is adjusted as follows:

$$w_{ij}^{\mathrm{new}} = w_{ij}^{\mathrm{old}} + \eta \left(x_i - w_{ij}^{\mathrm{old}}\right)$$

When the network classification result is inconsistent with the expected classification result, the weight is adjusted as follows:

$$w_{ij}^{\mathrm{new}} = w_{ij}^{\mathrm{old}} - \eta \left(x_i - w_{ij}^{\mathrm{old}}\right)$$

The weights of the other, non-winning neurons remain unchanged.

Step 5. Judging the number of iterations

The iterative process ends when the pre-set maximum number of iterations is reached; otherwise, it returns to step 3 to enter the next round of learning.
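The five steps above can be sketched in plain Python. This is a minimal LVQ1 illustration under stated assumptions, not the authors' implementation: the function names `train_lvq` and `predict_lvq`, the random order in which samples are presented, the tiny jitter added to the midpoint initialization, and the two-dimensional toy data are all hypothetical.

```python
import math
import random

def train_lvq(samples, labels, hidden_labels, eta=0.05, n_iter=500, seed=0):
    """Train an LVQ1 network; returns one weight vector per hidden neuron.

    hidden_labels[j] is the output class that hidden neuron j is wired to.
    """
    rng = random.Random(seed)
    m = len(samples[0])
    # Step 2: start every weight vector at the midpoint of the inputs.
    # The small jitter is an assumption, used only to break ties.
    lo = [min(s[i] for s in samples) for i in range(m)]
    hi = [max(s[i] for s in samples) for i in range(m)]
    weights = [[(lo[i] + hi[i]) / 2 + rng.uniform(-1e-3, 1e-3) for i in range(m)]
               for _ in hidden_labels]
    for _ in range(n_iter):
        # Step 1: present one training vector (random order is an assumption).
        k = rng.randrange(len(samples))
        x, target = samples[k], labels[k]
        # Step 3: the winner is the hidden neuron at the smallest distance.
        j = min(range(len(weights)), key=lambda q: math.dist(x, weights[q]))
        # Step 4: pull the winner toward x if its class is right, push it away if not.
        sign = 1.0 if hidden_labels[j] == target else -1.0
        weights[j] = [w + sign * eta * (xi - w) for w, xi in zip(weights[j], x)]
    # Step 5: training stops after the preset number of iterations.
    return weights

def predict_lvq(weights, hidden_labels, x):
    """Return the class of the hidden neuron closest to x."""
    j = min(range(len(weights)), key=lambda q: math.dist(x, weights[q]))
    return hidden_labels[j]

# Toy data (made up): two well-separated clusters, one hidden neuron per class.
samples = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2),
           (1.0, 1.0), (0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
hidden_labels = ["A", "B"]
weights = train_lvq(samples, labels, hidden_labels)
print(predict_lvq(weights, hidden_labels, (0.1, 0.1)))
print(predict_lvq(weights, hidden_labels, (0.9, 0.9)))
```

Only the winning neuron is updated in each iteration, which mirrors step 4; the number of hidden neurons, the learning rate, and the iteration count are exactly the parameters the study tunes later.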

3. Experimental setup and materials

Figure 1. Structure diagram of LVQ network.

Figure 2. Schematic diagram of the experimental setup.

Fritillaria cirrhosa, Fritillaria thunbergii, and Fritillaria pallidiflora, bought from the Bozhou TCM trading center, were used as samples in the experiment. Fifty Fritillaria fruits were purchased for each sample. The samples were ground into powders using a TCM pulverizer (model: 800Y). Next, the powdered samples were glued to glass slides using double-sided tape, as shown in figure 3. For each sample, 140 spectra were collected, each at a fresh position; 100 spectra were used to build the model, and 40 spectra were used to test the model.

4. Results and discussion

4.1. LIBS spectra

The typical LIBS spectra of each type of sample and of the double-sided tape are shown in figure 4. It can be seen from figure 4 that the intensities of some metal elemental lines differ among the three kinds of Fritillaria. For example, the intensity of Ca 422 nm in the spectrum of Fritillaria thunbergii is greater than those in the spectra of Fritillaria cirrhosa and Fritillaria pallidiflora. The intensities of Na 588 nm and Na 589 nm in the spectra of Fritillaria cirrhosa and Fritillaria thunbergii are greater than those in the spectrum of Fritillaria pallidiflora. The intensities of K 766 nm and K 769 nm in the spectra of Fritillaria thunbergii and Fritillaria pallidiflora are greater than those in the spectrum of Fritillaria cirrhosa. These macro metal elements in Fritillaria are derived from the soil. Because the content and proportion of metal elements in soil differ from region to region, the content of macro metal elements in Fritillaria from different origins also differs. The corresponding wavelengths and energy levels of these metal elements are listed in table 1.

The LIBS spectra of Fritillaria contain elemental lines of Ca, Na, and K, as well as molecular bands of CN and C2. The LIBS spectrum of the double-sided tape contains CN and C2 molecular bands. To avoid interference from the LIBS spectrum of the double-sided tape, the CN and C2 molecular bands were not used for Fritillaria classification. We selected seven spectral lines with an intensity greater than 1000 for classification.

The integral intensities of these seven spectral lines were calculated as the inputs of the classification models. To reduce the shot-to-shot fluctuation of the spectra, we normalized the LIBS data to the line of maximum intensity, K I 766.49 nm.
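The normalization described above amounts to dividing each integrated line intensity by that of the reference line within the same spectrum. A minimal sketch follows, assuming the intensities are held in a dictionary keyed by line name; the function name `normalize_by_reference`, the dictionary layout, and the intensity values are hypothetical and serve only as illustration.

```python
def normalize_by_reference(intensities, ref_key="K I 766.49"):
    """Divide every integrated line intensity by the reference line's intensity."""
    ref = intensities[ref_key]
    if ref <= 0:
        raise ValueError("reference line intensity must be positive")
    return {line: value / ref for line, value in intensities.items()}

# Made-up integrated intensities for a single laser shot.
shot = {"Ca I 422.67": 1500.0, "Na I 588.99": 3000.0, "K I 766.49": 6000.0}
print(normalize_by_reference(shot))
# {'Ca I 422.67': 0.25, 'Na I 588.99': 0.5, 'K I 766.49': 1.0}
```

Because every line in a shot is scaled by the same factor, a shot that is uniformly brighter or dimmer yields the same normalized feature vector.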

We first used principal component analysis (PCA) to analyze the LIBS spectra of the powdered Fritillaria samples and observe the distribution of the data. PCA is an unsupervised dimensionality-reduction method that has been applied in many fields [28–35]. The scores of the first three principal components (PCs) of 100 spectra of each type of sample (300 spectra in total) are shown in figure 5. The accumulated variance of the first three PCs is 96.021% (PC1 44.536%; PC2 36.077%; PC3 15.407%). Figure 5 shows a significant overlap among these three types of data, so the powdered samples of Fritillaria cirrhosa, Fritillaria thunbergii, and Fritillaria pallidiflora are difficult to distinguish using PCA. It can also be seen from figure 4 that the LIBS spectra of Fritillaria cirrhosa and non-Fritillaria cirrhosa are very similar.
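To make the explained-variance figures concrete, the sketch below computes the first principal component of a small data set and its share of the total variance. It uses power iteration on the covariance matrix, chosen here only to keep the example dependency-free; the source does not say which PCA routine was used. The function name `first_principal_component` and the toy data are hypothetical.

```python
import math

def first_principal_component(rows, n_steps=100):
    """Return (unit direction, explained-variance share) of the first PC."""
    n, m = len(rows), len(rows[0])
    # Center each feature on its mean.
    means = [sum(r[k] for r in rows) / n for k in range(m)]
    centered = [[r[k] - means[k] for k in range(m)] for r in rows]
    # Sample covariance matrix (m x m).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(m)]
           for a in range(m)]
    # Power iteration: repeatedly apply cov and renormalize.
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(n_steps):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("power iteration collapsed; try another start vector")
        v = [x / norm for x in w]
    # Rayleigh quotient gives the leading eigenvalue (variance along v).
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(m)) for a in range(m))
    total_variance = sum(cov[a][a] for a in range(m))
    return v, eigenvalue / total_variance

# Toy data (made up): points lying close to the line y = 2x.
rows = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
direction, share = first_principal_component(rows)
print(direction, share)
```

The shares of successive components add up to the accumulated variance, which is how a figure such as the 96.021% quoted above for three PCs is obtained.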

Figure 3. Powdered samples of (a) Fritillaria cirrhosa, (b) Fritillaria thunbergii, and (c) Fritillaria pallidiflora.

4.2. Identification of Fritillaria cirrhosa and non-Fritillaria cirrhosa

The powders of Fritillaria cirrhosa and non-Fritillaria cirrhosa could not be distinguished by the unsupervised method PCA. We therefore used three supervised methods, LVQ, LDA, and SVM, to identify the powdered samples of Fritillaria cirrhosa and non-Fritillaria cirrhosa.

The CCR, used as an indicator to evaluate the performance of the classifiers, was calculated using the following formula:

$$\mathrm{CCR} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\%$$


In a binary classification process, the output has only two possibilities: positive (P) or negative (N). In our case, P corresponded to Fritillaria cirrhosa, and N corresponded to non-Fritillaria cirrhosa. There were four possible results for the binary classifier. A true positive (TP) or a false positive (FP) was observed if the predicted output was Fritillaria cirrhosa and the actual input was Fritillaria cirrhosa or non-Fritillaria cirrhosa, respectively. Conversely, a true negative (TN) or a false negative (FN) was observed if the predicted output was non-Fritillaria cirrhosa and the actual input was non-Fritillaria cirrhosa or Fritillaria cirrhosa, respectively [36].
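As a worked example of these four counts, the sketch below computes the CCR, assuming it is the usual share of correctly classified spectra among all test spectra, (TP + TN) / (TP + TN + FP + FN). The function name `correct_classification_rate` is hypothetical. The counts correspond to the robustness test reported in section 4.3, where one Fritillaria thunbergii spectrum out of 120 test spectra (40 Fritillaria cirrhosa and 80 non-Fritillaria cirrhosa) was labelled as Fritillaria cirrhosa.

```python
def correct_classification_rate(tp, tn, fp, fn):
    """Percentage of test spectra whose predicted class matches the true class."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no test spectra were counted")
    return 100.0 * (tp + tn) / total

# 40 Fritillaria cirrhosa spectra, all recognized: TP = 40, FN = 0.
# 80 non-Fritillaria cirrhosa spectra, one mislabelled: TN = 79, FP = 1.
ccr = correct_classification_rate(tp=40, tn=79, fp=1, fn=0)
print(round(ccr, 2))  # 99.17
```

With 120 test spectra, each misclassified spectrum lowers the CCR by about 0.83 percentage points, which is why the reported values are 99.17%, 98.33%, and 97.5%.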

Seven normalized characteristic spectral lines were used as the inputs of the model, and the two outputs corresponded to the two classes. The model was built with 100 spectra of Fritillaria cirrhosa and 200 spectra of non-Fritillaria cirrhosa (100 spectra of Fritillaria thunbergii and 100 spectra of Fritillaria pallidiflora). It was tested with 40 spectra of Fritillaria cirrhosa and 80 spectra of non-Fritillaria cirrhosa (40 of Fritillaria thunbergii and 40 of Fritillaria pallidiflora).

Figure 4. Typical LIBS spectra of (a) Fritillaria cirrhosa, (b) Fritillaria thunbergii, (c) Fritillaria pallidiflora and (d) double-sided tape.

Table 1. Selected elements of LIBS spectra.

We used the control variable method to optimize the number of hidden layer neurons, the learning rate, and the number of iterations of the LVQ model. The particle swarm optimization algorithm was used to find the optimal penalty parameter c and kernel parameter g of the SVM model. The LDA model has no parameters to be optimized. These classification models were used to classify the powdered samples, and their optimal parameters, test times, and CCRs are listed in table 2.

LDA is a linear classifier. The CCR of the LDA model was 97.5%. SVM can achieve linear and nonlinear classification by changing the kernel function. When we used a nonlinear kernel, the radial basis function kernel, the CCR of SVM was 98.33%. Our classification problem was nonlinear and therefore suited to a nonlinear method. LVQ is a nonlinear classifier that uses supervised learning to train competitive networks. Among these three classifiers, although the test time of LVQ was longer than those of SVM and LDA, the CCR of LVQ was the highest, at 99.17%, and its identification result was the best. This indicated that LVQ was the most suitable classifier for our experimental data.

4.3. Test for LVQ robustness

To test the robustness of the LVQ model in coping with unknown samples not included in the training set [37], Fritillaria thunbergii and Fritillaria pallidiflora were each used in turn as the non-Fritillaria cirrhosa class to establish two models. Model I was built with 100 spectra of Fritillaria cirrhosa and 100 spectra of Fritillaria thunbergii. Model II was built with 100 spectra of Fritillaria cirrhosa and 100 spectra of Fritillaria pallidiflora. The test set for both models consisted of 40 spectra of Fritillaria cirrhosa, 40 of Fritillaria thunbergii, and 40 of Fritillaria pallidiflora. The optimal model parameters and test results are shown in table 3.

In table 3, for the model built with Fritillaria cirrhosa and Fritillaria thunbergii, the optimal parameters were 5 hidden layer neurons, a learning rate of 0.01, and 500 iterations. The CCR was 99.17%. For the model built with Fritillaria cirrhosa and Fritillaria pallidiflora, the optimal parameters were 5 hidden layer neurons, a learning rate of 0.09, and 800 iterations. Using the optimal model, the CCR was also 99.17%.

We trained model I and model II on different training sets and tested them on the same test set. In model I, one LIBS spectrum of Fritillaria thunbergii was erroneously classified as Fritillaria cirrhosa, and the same occurred in model II. Although one species in the test set was absent from each training set, the CCRs of both models equalled that of the LVQ model trained on all three kinds of samples. The experimental results showed that LVQ had good robustness.

Figure 5. Scores of the first three principal components of Fritillaria cirrhosa, Fritillaria thunbergii, and Fritillaria pallidiflora.

Table 2. The optimal parameters, test time, and CCRs of Fritillaria cirrhosa and non-Fritillaria cirrhosa discrimination models.

5. Conclusions

This research mainly focused on the feasibility of using LIBS technology to distinguish Fritillaria cirrhosa and non-Fritillaria cirrhosa. Clear LIBS emission lines of Ca, Na, and K, as well as the molecular bands of CN and C2, could be observed in the LIBS spectra of the Fritillaria powder samples. This indicated that LIBS could characterize the elemental composition of Fritillaria cirrhosa and non-Fritillaria cirrhosa powder samples well.

LIBS combined with LVQ was proposed to distinguish the LIBS spectra of the powdered samples of Fritillaria cirrhosa and non-Fritillaria cirrhosa. Compared with the LDA and SVM models, LVQ had the best classification result, 99.17%. Moreover, the LVQ model showed good robustness: when part of the test data was not included in the training set, the CCR was still 99.17%. The experimental results demonstrated that the proposed method could be used to identify powdered samples of Fritillaria cirrhosa and non-Fritillaria cirrhosa and had great application potential in medical drug identification.

Table 3. Test for LVQ robustness and identification results of Fritillaria cirrhosa (FC) and non-Fritillaria cirrhosa (NFC) (non-Fritillaria cirrhosa includes Fritillaria thunbergii (FT) and Fritillaria pallidiflora (FP)).

Acknowledgments

This work is supported by National Natural Science Foundation of China (No. 62075011) and Graduate Technological Innovation Project of Beijing Institute of Technology (No. 2019CX20026).