
DNA barcoding in herbal medicine: Retrospective and prospective

2023-06-26

Journal of Pharmaceutical Analysis, 2023, Issue 5

Shilin Chen, Xianmei Yin, Jianping Han, Wei Sun, Hui Yao, Jingyuan Song, Xiwen Li

a Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China

b Key Laboratory of Beijing for Identification and Safety Evaluation of Chinese Medicine, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China

c State Key Laboratory of Southwestern Chinese Medicine Resources, Chengdu University of Traditional Chinese Medicine, Chengdu, 611137, China

d Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100193, China

Keywords:

Herbal identification

Super-barcode

Mini-barcode

Meta-barcoding

Cutting-edge barcoding

ABSTRACT

DNA barcoding has been widely used for herb identification in recent decades, enabling safety and innovation in the field of herbal medicine. In this article, we summarize recent progress in DNA barcoding for herbal medicine to provide ideas for the further development and application of this technology. Most importantly, the standard DNA barcode has been extended in two ways. First, while conventional DNA barcodes have been widely promoted for their versatility in the identification of fresh or well-preserved samples, super-barcodes based on plastid genomes have rapidly developed and have shown advantages in species identification at low taxonomic levels. Second, mini-barcodes are attractive because they perform better in cases of degraded DNA from herbal materials. In addition, some molecular techniques, such as high-throughput sequencing and isothermal amplification, are combined with DNA barcodes for species identification, which has expanded the applications of herb identification based on DNA barcoding and brought about the post-DNA-barcoding era. Furthermore, standard and high-species-coverage DNA barcode reference libraries have been constructed to provide reference sequences for species identification, which increases the accuracy and credibility of species discrimination based on DNA barcodes. In summary, DNA barcoding should play a key role in the quality control of traditional herbal medicine and in the international herb trade.

1. Introduction

Herbal medicine plays a significant role in disease prevention and therapy. In the struggle with coronavirus disease 2019 (COVID-19), some traditional Chinese medicines have shown obvious curative effects and have been recommended by the World Health Organization (WHO) [1]. However, approximately 4.2% of herbal medicines in the market contain substitutes or adulterants [2]. Accurate authentication of herbal medicine is closely tied to patient safety and herbal efficacy. Researchers have attempted to solve this problem by origin, morphological, microscopic, physical and chemical identification. However, species identification has been an unsolved problem for thousands of years. The emergence of DNA barcoding has allowed herb identification to enter the era of molecular identification, thereby overcoming the limitations of traditional identification methods, and has allowed accurate identification of materials with similar morphologies and chemical structures [3]. Chen et al. [4] established an identification system for herbal medicine using ITS2+psbA-trnH for plant materials and cytochrome-c oxidase subunit 1 (COI) + ITS2 for animal materials, which accelerated standardization for molecular identification in herbal medicine. With new techniques emerging in recent years, such as high-throughput sequencing (HTS) technology and isothermal amplification, herb identification based on DNA barcoding has undergone rapid development and entered the post-DNA-barcoding era. Although several candidate loci and different combinations have been proposed, none of them work well across all species [5,6]. Taxonomists have thus proposed that a multi-locus barcode or even the complete chloroplast genome (cp-genome) sequence could serve as a super-barcode for herb identification [7-13].

Accurate identification of mixed materials is particularly urgent and of practical significance in herbal medicine. Traditional Chinese medicine is used clinically mostly in the form of Chinese patent medicines (CPMs). The accuracy of the source materials is very important to ensure the safety and effectiveness of CPMs. However, traditional identification methods are limited in their ability to distinguish processed herbal medicine, because processing may remove the original diagnostic features. Based on the DNA barcoding database, researchers have attempted to determine the origin of CPMs by meta-barcoding, which provides a valuable way to monitor the quality of CPMs. In addition, mini-barcodes developed from conventional barcoding sequences can also be used to identify the origin of targets with severe DNA degradation, such as old specimens, CPMs, heat-treated foods and gastric contents. Based on the DNA barcoding sequences, some isothermal techniques were developed and used to identify herbal medicines on site. Species-specific DNA barcodes, i.e., highly variable sites, were screened from the whole plastid genome sequence for the identification of closely related species.

Here, we review the history of plant barcodes, evaluate the advantages and limitations of currently used barcodes and discuss the application prospects of DNA barcoding in herbal medicine. In addition, we discuss the use of cutting-edge technology to optimize and improve DNA barcode-based herb identification (Fig. 1). The prospective development and applications of DNA barcoding in herbal medicine are also discussed in this review.

Fig. 1. Application of DNA barcoding and its derivative technology in the identification of medicinal resources. HRM: high-resolution melting; LAMP: loop-mediated isothermal amplification; KASP: kompetitive allele-specific polymerase chain reaction; HTS: high-throughput sequencing.

2. Conventional DNA barcodes

DNA barcoding is regarded as an efficient tool for herbal medicine identification. Single loci or multiple loci have been widely used and have provided sufficient resolution for the identification of most herbs. It is currently the main molecular identification method but still has some limitations. For example, due to low sequence divergence and multifaceted speciation patterns involving recent radiations and hybridizations, the discriminative efficiencies of DNA barcoding in Rhododendron and tropical woody bamboos were relatively low [14]. Taking the morphology and the physical and chemical properties of the species into account can overcome the deficiency of conventional DNA barcodes for the identification of closely related species. In addition, when the conventional DNA barcodes of the specimens and the origin of CPMs are difficult to obtain, the mini-barcoding or meta-barcoding technique based on HTS can meet this demand.

2.1. Single-locus barcodes

We searched the literature with the keywords “DNA barcode, identification”. In 2010, ITS2 was proposed as an efficient barcoding tool for medicinal plants in an evaluation of seven DNA regions, including chloroplast genes and nuclear genes, for the authentication of various medicinal plants and counterfeits [6]. ITS2 correctly identified more than 90% of 4,800 species in 753 genera owing to its high variability and discriminatory power. Yao et al. [15] subsequently proposed ITS2 as a DNA barcode for plant and animal medicine based on an investigation of delimitation ability in 50,790 plants and 12,221 animals. At the species level, the success rates of the ITS2 region in identifying different taxonomic groups were 67.1%-91.7% [15]. ITS2 remains the best and most widely used single-locus barcode for the delimitation and identification of herbs, as it is easy to amplify and has enough variability to distinguish even closely related species [2,16-18]. In addition, although ITS2 is a multiple-copy sequence, its intra-genomic distances are distinctly smaller than those of intra-specific or inter-specific variants [19,20]. In general, species can be sufficiently identified using ITS2, although the identification ability of ITS2 DNA barcodes differs among species and taxonomic levels [21]. In view of the above, Li et al. [20] additionally evaluated DNA barcodes for different taxa and suggested that ITS can be used as the core barcode of seed plants (79% in angiosperms). Hebert et al. [22] suggested that the mitochondrial COI gene can be used as a DNA barcode for animal authentication. Since then, the identification effect of COI has been well verified in fish [23], reptiles [24,25], insects [26] and other groups. Additionally, COI is extensively used in the identification of animal drugs in Chinese medicine, such as snake medicines, seahorse medicines, and ungulate medicines [27-29]. The COI gene is the preferred target DNA region in animal species identification. Although other mitochondrial (cytb, 16S rRNA, and 18S rRNA) and nuclear genes have also been used for animal identification, designing universal primers for cytb may be difficult for some animals due to a high level of sequence variation [30]. 16S and 18S sequences are relatively conserved in a wide range of taxa [30] and thus can be used to discriminate species with deep intra-specific divergences where COI lacks high resolving power [30]. In addition, in a previous study, although the 16S rDNA gene did not perform well in the identification of some species, COI and 16S rDNA performed well in discriminating neogastropod genera [31]. Schoch et al. [32] evaluated four markers as DNA barcodes for 742 strains or specimens representing 226 fungal species and suggested that, with the barcode gap between inter- and intra-specific variations, ITS had the highest identification efficiency for the broadest range of fungi. Additionally, Vu et al. [33] evaluated the suitability of ITS and ribosomal large subunit (LSU) for identification of filamentous fungal strains. ITS performed better than LSU in the recognition of filamentous fungal species, with success rates of 82% vs. 77.6%, respectively.
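The barcode gap referred to above can be illustrated with a short calculation. The following is a minimal sketch, assuming sequences that are already aligned to equal length and a simple uncorrected p-distance; the function names and the toy records are hypothetical and are not taken from any of the cited studies, which typically use dedicated alignment and distance software.

```python
from itertools import combinations


def p_distance(a, b):
    """Uncorrected p-distance: share of differing sites, ignoring gaps and Ns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue  # skip positions that carry no usable information
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0


def barcode_gap(records):
    """records: list of (species_label, aligned_sequence) pairs.

    Returns (max_intraspecific, min_interspecific, gap), where a positive
    gap means every between-species distance exceeds every within-species one.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    max_intra = max(intra, default=0.0)
    min_inter = min(inter, default=0.0)
    return max_intra, min_inter, min_inter - max_intra


# Toy, made-up alignment (hypothetical labels and sequences).
toy_records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACTTAG"),
    ("species_B", "ACGAACTTAG"),
]
max_intra, min_inter, gap = barcode_gap(toy_records)
```

A locus is a useful barcode for a group when the gap is positive for most species pairs; in real data the two distance distributions often overlap, which is one reason the success rates reported above fall short of 100%.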

2.2. Multi-locus barcodes

To date, many studies have shown that it can be problematic to find an ideal DNA barcode for all species. Single-locus barcodes can support most species discrimination tasks but not all. The combination of two or more barcodes usually improves discriminative efficiency in several special lineages for which single-locus barcoding is insufficient. Therefore, rbcL+matK was recommended by the Consortium for the Barcode of Life (CBOL) Plant Working Group as the multi-locus barcode for plants, and many researchers have also proposed different options. Pang et al. [34] evaluated four combinations of DNA barcodes containing 2,190 sequences from 586 species. The multi-locus barcode trnH-psbA+ITS2 showed the highest identification efficiency in 41 of the 47 families. As one of the supplementary barcodes, psbA-trnH is a variable region and can be easily amplified in a wide range of terrestrial plants [35]. Tripathi et al. [36] tested the efficacy of five barcodes for 300 accessions of tropical tree species and recommended ITS+trnH-psbA as the best combination. Liu et al. [37] also found that ITS/ITS2+psbA-trnH has higher potential value as a DNA barcode for Apiaceae identification with large samples. Xu et al. [38] evaluated the identification ability of 11 candidate barcodes for Dendrobium species, and ITS+matK showed the highest discrimination rate (76.92%). Chen's group [15] proposed that applying ITS2 as a core barcode and psbA-trnH as a complementary locus could enable accurate identification of medicinal plants, and that COI+ITS2 was a better combination for the identification of some medicinal animals, such as cnidarians, because ITS2 sequences have a higher divergence rate than COI in cnidarians. Carew et al. [39] found that COI sequences successfully identified 96% of macro-invertebrate species, but when combined with cytb sequences, the success rate increased to 99%. This combination has performed well in the identification and discovery of new species in many lineages [40-42]. Phylogenetic analyses based on the combination of ITS, LSU and ribosomal small subunit (SSU) provided reliable and robust resolution for arbuscular mycorrhizal fungi from the phylum to species level.

2.3. Construction of the DNA barcode reference library

DNA barcoding is a promising innovative approach for species authentication and ecosystem biomonitoring. However, its effectiveness depends largely on the reliability and coverage of the reference library, as well as on the identification of primer sets that work across the broadest range of taxa. A good library should contain enough species and their closely related species, and each species should be represented by sufficient individuals covering almost all intra-specific variation, which can increase the accuracy and credibility of species identification (Fig. 2). There are three main types of libraries: exclusive libraries, regional libraries and broad-spectrum libraries. Exclusive libraries work for only one or a few taxa and are characterized by high coverage and high quality. These libraries are usually constructed to fill existing gaps in barcode libraries [43]. In 2014, Chen's group [4] established a DNA barcoding database based on ITS2 and psbA-trnH barcodes (http://www.tcmbarcode.cn) for the first time, and a species identification module was provided for Chinese medicine. The database covered more than 95% of medicinal materials in the pharmacopoeias of China, Japan, Europe, Korea, the USA and India [4]. Gong et al. [44] built a reference library of Amomum villosum Lour. populations for genetic diversity and structure analysis. Regional libraries are constructed based on comprehensive sampling of a specific administrative division or geographical region. Chen's group [45] has built a reference library for monitoring herbal medicine in the Japanese, Korean and Brazilian pharmacopoeias. In 2018, Gong et al. [46] constructed a library based on ITS2 for herbs in southern China, which provided a resource for the molecular identification of southern Chinese medicine. Vassou et al. [47] created a reference library for medicinal plants used in Ayurvedic medicine. Gill et al. [48] built a plant barcoding library for the semi-arid East African savanna by using more than 1,500 specimens. BOLD and the GenBank database are the main broad-spectrum databases for DNA barcoding. In a comparison of the two, barcode sequences generated from reference samples were searched against each database. While GenBank outperformed BOLD in species-level identification of insect taxa, the two databases performed comparably for plants and macrofungi [49]. These databases also contain barcoding data for counterfeit and closely related species. However, barcoding libraries remain poorly developed for organisms that are challenging to identify morphologically [50].

Fig. 2. Common DNA barcoding applications in traditional medicines. Application frequency is indicated by color, with blue indicating high application frequency and yellow indicating low application frequency. COI: cytochrome-c oxidase subunit 1; LSU: large subunit; SSU: small subunit.

3. Super-barcode: genome-based identification for closely related species

Conventional DNA barcodes lack sufficient variation at the gene level, and currently, no barcode can work across all species [51]. Genome-based identification was proposed and showed higher discriminatory ability in situations where general DNA barcodes could not provide accurate authentication, especially for closely related species [11]. Super-barcoding is a genome-based identification method that uses the complete plastid genome as a DNA barcode. Since the idea was proposed in 2008, many studies have confirmed its effectiveness and feasibility at lower taxonomic levels [52]. Because of the high sequencing cost and the difficulty of obtaining whole cp-genome sequences compared with single-locus DNA barcodes (ITS2, matK, etc.), plant super-barcoding is used almost exclusively for distinguishing fresh plant samples or dried leaves rather than herbal materials or decoction pieces. Although super-barcoding is becoming more widely accepted, a challenging trade-off exists between the high cost and the quantity of cp-genome sequences in databases, especially in cases where repeated samples are lacking. Therefore, super-barcoding is not always recommended if commonly used DNA barcodes can achieve the purpose of identification [53]. Specific barcodes are thus generated and adopted in plant identification when standard barcodes do not offer accurate authentication and super-barcodes are beyond the requirements of experiments for some laboratories, especially for a given taxonomic group [51]. However, the advantages of super-barcoding are clear in cases where traditional DNA barcodes are limited in species identification at lower taxonomic levels.

With the development of next-generation sequencing (NGS) and the increase in multiple demands for species identification, super-barcodes may differentiate not only between closely related species but also between populations or individuals. The strategy or the level of data analysis of super-barcoding will change based on the purpose of application.

3.1. Genome recognition

Despite some initial criticisms, such as the propensity for hybridization, super-barcoding has become widely recognized and has had a strong influence on species identification. Zhang et al. [54] performed a phylogenetic analysis of Dracaena and demonstrated the utility of the cp-genome as a super-barcode in lower-level genetic studies. Xia et al. [55] and Chen et al. [56] verified the discriminatory ability of super-barcodes in Chrysanthemum and Ligularia, respectively. However, these super-barcoding studies used few species and almost no intra-specific samples. Wu et al. [53] evaluated the feasibility of using the cp-genome as a super-barcode to discriminate different species from Fritillaria on a large scale. The study confirmed that using the cp-genome as a DNA barcode to identify species was generally straightforward. Although many preliminary studies have verified the potential of super-barcodes, there is still some evidence that super-barcoding does not always allow for the identification of all plants [57]. As a result, some researchers have returned their attention to nuclear genes. The use of the plastid genome combined with rDNA was gradually adopted in the field of species identification. Genome skimming was used as a universal extended barcode [58]. However, these data were analyzed at the genomic level. Kane et al. [10] used whole cp-genomes and 6 kb of nuclear ribosomal DNA as an “ultra-barcode” to distinguish cacao, including nine individuals of T. cacao and one individual of T. grandiflorum. They demonstrated that ultra-barcoding is a viable method for reliably distinguishing varieties and even individuals of different genotypes of cacao. Genome skimming based on low-coverage shotgun sequencing of total DNA can recover complete plastid genome sequences, eliminating the need to target whole cp-genome sequences separately, as traditional super-barcoding does. Fu et al. [59] performed genome skimming to recover near-complete plastid genomes and rDNA for Rhododendron species with a large number of samples and compared the discrimination success with that of multi-locus DNA barcodes. They considered the demonstrable increase in discriminatory power to be due to extensive plastid genome data.

To find a suitable standard DNA barcode, we need to balance ease of polymerase chain reaction (PCR) amplification against high rates of sequence divergence. In addition, sufficient extracted DNA and sequence alignment are also key issues. Super-barcoding may circumvent some of the limiting factors described above. First, gene categories in the cp-genome usually differ between species in different plant groups. Species identification can also be performed based on the presence or absence of a specific gene in two species. This was considered to be the simplest method for species identification [60]. Second, plastid gene order is not always the same in different plant groups, which provides new insights into plant identification [61,62]. Genome recognition gives less attention to variation among sequences. As in face recognition technology, extracted key genome characteristics can be representative and have enough information for species authentication. The improvement of NGS and bioinformatics has provided an increasing number of cp-genome sequences for genome-based species identification. Since the first cp-genome was sequenced in 1986 [63], the total number of cp-genome sequences has increased rapidly. As of July 2022, there were 7,467 complete plant cp-genomes in the GenBank database (Fig. 3). However, these resources have not yet allowed widespread species identification. The number of new cp-genomes published in the last two years exceeded half of the total number sequenced in the previous 24 years. In particular, the percentage of genera with sequences available for more than three species reached 25%. Genome recognition will eliminate tedious sequence alignment and provide a simple and fast technique for plant species identification (Fig. 4).
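The idea of identifying a sample from the presence or absence of genes, rather than from aligned sequence variation, can be sketched in a few lines. This is an illustrative example only: the taxon labels and gene lists are invented, the Jaccard index is one assumed choice of similarity measure, and the sketch ignores gene order, which the text also names as a diagnostic feature.

```python
def gene_content_differences(genes_a, genes_b):
    """List the genes unique to each annotation and the genes they share."""
    a, b = set(genes_a), set(genes_b)
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "shared": sorted(a & b),
    }


def assign_by_gene_content(query_genes, reference):
    """Pick the reference taxon whose gene set is most similar to the query.

    reference: dict mapping a taxon label to an iterable of gene names.
    Similarity is the Jaccard index (shared genes / all genes in either set).
    Returns (best_taxon, scores_by_taxon).
    """
    query = set(query_genes)
    scores = {}
    for taxon, genes in reference.items():
        ref = set(genes)
        union = query | ref
        scores[taxon] = len(query & ref) / len(union) if union else 1.0
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical plastid gene annotations for two made-up taxa.
toy_reference = {
    "taxon_X": {"rbcL", "matK", "ndhF", "infA"},
    "taxon_Y": {"rbcL", "matK", "ycf1"},
}
toy_query = {"rbcL", "matK", "ndhF"}
best_taxon, similarity = assign_by_gene_content(toy_query, toy_reference)
diff = gene_content_differences(toy_reference["taxon_X"], toy_reference["taxon_Y"])
```

Because only sets of gene names are compared, no sequence alignment is needed, which is the practical appeal of genome recognition. The approach depends entirely on the completeness and consistency of the underlying annotations.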

Fig. 3. The total number of chloroplast genome sequences (data extracted from the GenBank database) and the numbers of genera represented by a single species or by multiple species.

Fig. 4. Schematic timeline of the history and possible developments of plant super-barcoding. cp-genome: chloroplast genome.

3.2. Pangenome-based species identification

The phylogenetic resolution of clades is unlimited, ranging from phylum and family to genus, species and beyond. For medicinal plants, species identification not only includes inter-species authentication but also covers discrimination at lower taxonomic levels, mainly focusing on identifying populations and phenotypes in different distribution areas. Samples of the same plant species from different geographic environments may generate different chemical substances that perform different functions. Traditional DNA barcodes have difficulties in discerning such different populations and individuals. Super-barcoding uses the full-length cp-genome as a super-gene and overcomes the limitations of commonly used standard DNA barcodes at the subspecies level. Wu et al. [53] confirmed that a super-barcode could successfully distinguish different species from Fritillaria. Moreover, it also reflected the biogeographic characteristics of species in this genus. Their super-barcoding results from cluster analysis were consistent with geographical distribution and chemical analysis results. Super-barcoding has the potential to distinguish species at the population level and even the ecotypes of individuals.

Since the origin of the pangenome [64], researchers have paid increasing attention to the dispensable genome, which represents the difference in genome elements in a given species or clade. Advances in long-read sequencing technologies make it possible to assemble individual genomes into a reference pangenome, which addresses the issues in individual discrimination. With the development of molecular-assisted breeding technology, the need for individual identification will become increasingly urgent. Selecting individual medicinal plants with good characteristics and performing whole-genome sequencing to find effective markers will shorten the breeding period for group screening in the field. A reference pangenome is very helpful for identifying target individuals in cultivated mixed groups. In addition, to protect the patent rights of new varieties, pangenome-based authentication can be performed by examining gene presence/absence instead of sequence similarity [19] (Fig. 4). With the development of herbal genomics [20] and generation of more genome-wide fine maps [65-69], pangenome-based identification will greatly facilitate resource discovery, breeding and quality improvement of traditional Chinese medicinal materials.

4. Mini-barcoding: species identification of materials or CPMs

4.1. Proposal of DNA mini-barcodes

DNA barcoding has been extensively applied in the species determination of animals and plants. However, in practical applications, it is frequently found that complete fragments of conventional DNA barcodes are difficult to amplify successfully from targets with severe DNA degradation, such as old specimens, CPMs, heat-treated foods and gastric contents [70-72]. In animal specimens that have been stored for more than 2 years, the intact COI barcode can be obtained from only 39% of individuals, while in more than 90% of the samples, segments less than 300 bp can be successfully amplified [73]. Lo et al. [72] reported that DNA fragments shorter than 121 bp could be amplified while longer sequences could not in pulverized herbs boiled for 120 min. Hence, Yeo et al. [74] first solved the problem by proposing “DNA mini-barcodes”, namely, shorter molecular markers (generally 100-300 bp) that can be used to identify objects with varying degrees of DNA degradation. In a test of 30,000 specimens covering 5,500 species, mini-barcodes showed no significant differences from full-length DNA barcodes in terms of species identification performance. The authentication of herbal ingredients in decoctions or CPMs is critical to guarantee that the public uses medicine safely and effectively. The short markers can also be amplified and sequenced from different types of CPMs, which contributes to the quality control of compound preparations [75]. Wang et al. [76] investigated 12 batches of commercially available CPMs using a 159 bp DNA barcode and found that in two products, the main monarch drug was partially or even completely adulterated with unlabelled ingredients. Furthermore, Han's group [77,78] proposed extremely short nucleotide signatures, normally less than 50 bp, for the authenticity detection of specific herbal ingredients in CPMs and the diagnosis of food poisoning in forensic science. Nucleotide signatures are developed for particular species or genus groups and are highly conserved and free of variation within the corresponding taxon. The application of DNA mini-barcodes in the identification of Chinese medicinal materials in the last 10 years is summarized in Table 1 [77-93]. With the advantages of a high success rate of amplification and sequencing, DNA mini-barcoding undoubtedly has bright application prospects in the detection and identification of biological samples.

Table 1 The application of DNA mini-barcodes in the identification of Chinese medicinal materials in the last 10 years.

4.2. Classification and application of DNA mini-barcodes

4.2.1. Universal DNA mini-barcodes

Universal DNA mini-barcodes originally included short molecular markers derived from conventional barcode loci such as ITS, ITS2 and COI. They are applicable to the identification and differentiation of most species, and short fragments show a higher success rate of amplification and sequencing, especially for DNA-degraded materials [94,95]. Compared with the 97% resolution of the full-length COI barcode (~650 bp), the recognition success rate of 100 bp and 250 bp sequences within the COI region can reach 90%-95% [96]. Song et al. [79] revealed that the primary nuclear locus ITS2 and four chloroplast loci, psbA-trnH, rbcL, matK, and the trnL (UAA) intron, could be successfully amplified in only 8.89%-20% of 45 processed herbal samples belonging to 15 species. In contrast, the mini-barcode trnL (UAA) p6 loop was successfully amplified in 75.56% of processed medicinal materials. In general, regions with high variation are optimal candidates for mini-barcodes that distinguish as many different species as possible. Mini molecular markers have been widely used in the quality inspection of food, food supplements, processed herbs and CPMs [97]. For example, a 16S rRNA fragment of approximately 200 bp was recommended in a test of animal-derived products, and 23% of the 52 samples were found to contain undeclared species [98]. Six primer pairs targeting short fragments (127-314 bp) of COI regions were developed to test 44 processed fish products, 93.2% of which could be identified at the species or genus level [99]. Short barcodes are also suggested to ensure the accurate use of medicinal resources. The rbcL-based plant DNA mini-barcode was applied in the quality control of herbal formulations, and some non-listed ingredients were detected in the preparations [100]. This undoubtedly provides strong support for the detection and regulation of species substitution in processed products in the commercial market. For some related species, common mini-barcodes sometimes cannot provide sufficient genetic information and mutation sites to distinguish them. In such cases, other new loci are needed [80,81]. Mini-barcodes located in the hypervariable regions of organelle genomes, such as cp-genomes and mitochondrial genomes, have been developed as auxiliary tools [82]. Dong et al. [80] provided a cp-genome-based strategy for designing taxon-specific DNA mini-barcodes without lowering discrimination power, and ycf1a (60 bp), ycf1b (100 bp) and rps16 (280 bp) were identified as candidate barcodes for the Panax genus. In conclusion, DNA mini-barcodes present great advantages in the detection and authentication of DNA-degraded materials.

4.2.2. Species-specific nucleotide signature

The nucleotide signature was initially proposed for raw herb detection in proprietary Chinese medicines, including powders, crude extracts, capsules, pills, tablets and syrups. Specific short fragments were used to test the authenticity of herbal ingredients in CPMs with complex components that had undergone a series of processes, such as crushing, boiling and frying, which could help prevent accidental misuse and intentional adulteration and substitution, especially for confusable, precious or endangered medicinal resources [77]. Compared with universal barcodes, nucleotide signatures are more specific and are unique to particular taxa. According to their scope of application, they can be divided into species-level and genus-level nucleotide signatures. The medicinal plants for which nucleotide signatures have been designed include Angelica sinensis, Ophiopogon japonicus, Schisandra chinensis, Panax quinquefolius, and P. notoginseng [77,83,101,102]. The abovementioned molecular markers combined with specific primers have been successfully applied in adulteration detection of different compound preparations. Wang et al. [84] developed four nucleotide signatures (30-37 bp) to detect common adulterants of Cistanches Herba and found that up to 36.4% of the 66 tested commercial products were adulterated. The double-peak method combined with a 34 bp nucleotide signature was used to evaluate the botanical extracts and complex CPMs labelled as containing Lonicera species. Only 17% of the extracts and 22% of the CPMs appeared to be authentic, while adulteration and substitution appeared in the rest [85]. In addition, Wang et al. [78,86] developed genus-level nucleotide signatures for Aconitum and Ephedra. Compared with a nucleotide signature designed for a particular species, a nucleotide signature at the genus level is particularly suitable for the detection of a genus containing certain characteristics. In addition, the combination of nucleotide signatures and HTS technology aided in the identification and detection of toxic Aconitum ingredients in unknown mixture samples, which provided a new perspective and strong support for the diagnosis of food poisoning.

4.3. Advantages and limitations of mini-barcodes

We will first discuss the advantages of mini-barcode markers and then discuss their drawbacks. Mini-barcodes can be used as a powerful tool to obtain barcode information from specimens after complex processing or long-term storage. In addition to enabling amplification from degraded DNA, DNA mini-barcodes are also more convenient to combine with other detection techniques, such as high-resolution melting (HRM), loop-mediated isothermal amplification (LAMP) and kompetitive allele-specific PCR (KASP) markers. Mini-barcode regions can also be easily retrieved via HTS. Mini-barcodes combined with HTS can identify multiple species with the maximum recovery of amplicons from processed products containing complex ingredients [97,103,104]. Moreover, the nucleotide signatures can be retrieved directly from raw HTS sequencing data without assembly and annotation, which helps avoid tedious data processing procedures and saves much time [86]. However, mini-barcodes also have drawbacks. First, it is relatively difficult to design universal primers for them compared with normal barcode regions (ITS2, psbA-trnH, etc.) when working with degraded DNA. The amplification and sequencing of mini-barcodes often require the design of specific primers, which also reduces the universality of this method [105]. Second, the reduced size of mini-barcodes, with less genetic information and fewer variable sites, inevitably causes a decrease in species resolution [106]. Some barcode markers are applicable only to particular taxa, and the addition of other individuals may interfere with the analysis results. A single locus may be insufficient to distinguish different species, and multiple barcodes are needed to guarantee accurate identification [107]. In conclusion, conventional DNA barcodes with high accuracy are preferred for the identification of fresh or well-preserved samples, while mini-barcodes perform better for DNA-degraded targets. Thus, mini-barcodes will extend the molecular authentication of herbal products.
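Retrieving a nucleotide signature directly from raw reads, without assembly or annotation, amounts to an exact substring search on both strands. The sketch below illustrates that idea under simplifying assumptions: exact matching only, a four-line FASTQ layout, and an invented 10 bp signature (published signatures are typically 30-37 bp). The function names and toy reads are hypothetical and do not reproduce the pipeline of the cited studies.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fastq_sequences(lines):
    """Yield the sequence line of each 4-line FASTQ record."""
    for index, line in enumerate(lines):
        if index % 4 == 1:
            yield line.strip()


def count_signature_hits(reads, signature):
    """Count reads containing the signature on either strand (exact match)."""
    forward = signature.upper()
    reverse = reverse_complement(forward)
    hits = 0
    for read in reads:
        read = read.upper()
        if forward in read or reverse in read:
            hits += 1
    return hits


# Made-up FASTQ content and signature, for illustration only.
toy_fastq = """@r1
TTGATTACAGGCAA
+
IIIIIIIIIIIIII
@r2
AAGCCTGTAATCTT
+
IIIIIIIIIIIIII
@r3
ACGTACGTACGTAC
+
IIIIIIIIIIIIII
""".splitlines()

toy_signature = "GATTACAGGC"
hits = count_signature_hits(read_fastq_sequences(toy_fastq), toy_signature)
```

In practice the lines would come from an open FASTQ file rather than an in-memory string, and sequencing errors may call for tolerant matching, which this sketch does not attempt.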

5.Cutting-edge barcoding-based technology for herb identification

Despite the accuracy of DNA barcodes for species identification, many researchers think that alternative methods of identifying herbal species should be developed using barcode information or other molecular markers without gel extraction or sequencing. For example, using sequence characterized amplified region (SCAR) markers derived from DNA barcodes or discriminative SSR sequences, amplification of species-specific sequences can be visualized and used to distinguish herbal medicine via conventional PCR and real-time PCR. Compared to conventional PCR, real-time PCR is more sensitive and faster. The use of quantitative real-time PCR (qRT-PCR) for herb identification is widespread, and processed herbs containing small amounts of DNA can be intensively analyzed by real-time PCR. HRM analysis combined with DNA barcoding or molecular markers is a recently developed post-PCR approach that can be used to identify herbal medicine species in a real-time PCR machine. More recently, digital PCR, in particular droplet digital PCR (ddPCR), has enabled highly accurate, direct quantification without the need for a calibration curve [108].

5.1.High-resolution melting technology and ddPCR

The Bar-HRM technique combines DNA barcoding with HRM analysis. Through HRM technology, small differences between amplicons can be detected via direct melting of nucleic acid samples. Detection of these variations depends on sample composition, the DNA barcoding fragment used, sequence length, GC content, and the complementarity of the strands [109-111]. The first step of this approach is PCR amplification of the region of interest (ITS2, psbA-trnH, rbcL, matK, ITS, etc.) [112,113] in the presence of a dye that binds specifically to double-stranded DNA (dsDNA) [114-116]. HRM analysis is conducted in a single 'closed tube', with no post-PCR handling before the results are displayed. Because of these merits, the technique is extensively utilized in the pharmaceutical industry and the herbal medicine market to analyze raw materials, detect adulterants, and check the quality of herbal goods [117,118]. The advantages of rapid authentication through HRM were first reported for Sideritis species, using the ITS2 region [119]. Another study team used HRM analysis and the chloroplast DNA region psbA-trnH to differentiate Panax notoginseng from adulterant species [120], and Aristolochia manshuriensis, saffron (Crocus sativus), and mutong (Akebia quinata) were all identified using the same procedure [112,121]. These investigations showed that HRM analysis can distinguish original species from adulterant species. The limitation of Bar-HRM analysis is its inability to distinguish closely related species with little genetic variability. To overcome this issue, Osathanunkul et al. [122] created a specific mini-barcode for the rbcL gene to distinguish three medicinal plants in the Acanthaceae (Acanthus ebracteatus, Andrographis paniculata and Rhinacanthus nasutus). According to Singtonat and Osathanunkul [123], the four DNA mini-barcodes matK, rbcL, trnL, and rpoC may be used to identify herbal materials originating from Thunbergia laurifolia. Some studies reported that Bar-HRM was able to detect one herb adulterant in another herbal product at a frequency as low as 1%, including in mixtures involving toxic medicinal materials, such as Armeniacae semen amarum mixed into Persicae semen and Crotalaria spectabilis adulterants in Thunbergia laurifolia [123,124]. A number of case studies and technological innovations suggest that Bar-HRM is an effective strategy for testing adulteration and poisonous materials that may occur in cases of commercial fraud, as well as for confirming the identity of raw and processed medicinal products.
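The principle behind HRM discrimination, namely that amplicons differing in sequence melt at slightly different temperatures, can be sketched as follows. The code takes a melt curve (fluorescence versus temperature), computes the negative first derivative, and reports the temperature at which it peaks as the apparent melting temperature. The logistic curves, the two melting temperatures and the 0.1 °C step are synthetic assumptions used only to make the example runnable; real instruments export normalised fluorescence data and apply their own curve-clustering algorithms.

```python
import math


def melting_temperature(temps, fluorescence):
    """Return the temperature at which -dF/dT is largest (apparent Tm)."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        # Central difference approximation of the negative derivative.
        slope = -(fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


def synthetic_melt_curve(tm, temps, width=0.5):
    """Idealised melt curve: fluorescence falls from 1 to 0 around tm (assumption)."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]


# Temperature ramp from 75.0 to 90.0 degrees C in 0.1 degree steps (hypothetical).
temps = [75.0 + 0.1 * i for i in range(151)]

# Hypothetical amplicons: a genuine species and an adulterant melting 1.5 degrees higher.
tm_genuine = melting_temperature(temps, synthetic_melt_curve(82.0, temps))
tm_adulterant = melting_temperature(temps, synthetic_melt_curve(83.5, temps))

print(f"genuine Tm ~ {tm_genuine:.1f}, adulterant Tm ~ {tm_adulterant:.1f}")
```

When two amplicons differ too little in sequence, their derivative peaks overlap, which corresponds to the limitation for closely related species described above.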

ddPCR was invented for absolute quantification of target DNA copy numbers based on limiting dilution, PCR, and Poisson analysis to overcome the limitations of conventional qPCR tests using Ct values and standard curves [125,126]. With this method, the reaction mixture is divided into hundreds to millions of compartments, and the amplification is then detected in real time [127]. ddPCR is now extensively employed for low-abundance nucleic acid detection and disease diagnosis [128,129], detection of transgenic plants [130,131], analysis of food [132,133] and identification of meat species [134]. Recently, herb species have been analyzed qualitatively and quantitatively using ddPCR assays. Duplex ddPCR has been used for the identification and quantification of Panax notoginseng powder and its adulterants as well as Crocus sativus [132].
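The Poisson analysis mentioned above can be made concrete with a short sketch. If k of n partitions are positive, the mean number of target copies per partition is estimated as λ = -ln(1 - k/n), and dividing by the partition volume gives a concentration without any standard curve. The partition counts and the 0.85 nL partition volume below are hypothetical values chosen for illustration; the actual volume depends on the instrument used.

```python
import math


def copies_per_partition(positive, total):
    """Poisson estimate of mean target copies per partition: -ln(1 - k/n)."""
    if total <= 0 or positive < 0 or positive >= total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def copies_per_microlitre(positive, total, partition_volume_nl):
    """Convert the Poisson estimate into copies per microlitre of reaction."""
    lam = copies_per_partition(positive, total)
    return lam / (partition_volume_nl / 1000.0)  # nL -> uL


def adulterant_fraction(genuine_conc, adulterant_conc):
    """Share of adulterant target copies among all detected target copies."""
    return adulterant_conc / (genuine_conc + adulterant_conc)


# Hypothetical duplex run: two targets counted in the same 20,000 partitions.
genuine = copies_per_microlitre(5000, 20000, partition_volume_nl=0.85)
adulterant = copies_per_microlitre(400, 20000, partition_volume_nl=0.85)
share = adulterant_fraction(genuine, adulterant)

print(f"genuine: {genuine:.1f} copies/uL, adulterant: {adulterant:.1f} copies/uL")
print(f"adulterant share of copies: {share:.3f}")
```

Note that the result is a ratio of DNA copy numbers; translating it into a mass fraction of herbal material would require additional calibration that this sketch does not attempt.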

5.2.Isothermal amplification technology

The amplification of barcodes or other molecular markers in herbs using molecular biology technology can be divided into two categories: non-isothermal and isothermal. The major advantages of non-isothermal barcode amplification, i.e., PCR, are its rapidity, high sensitivity, and specificity. However, the PCR assay requires expensive equipment and/or much time. In comparison with PCR, isothermal amplification methods have shorter reaction times and do not require professional equipment. Herbal species testing is therefore being carried out using isothermal molecular technology in addition to PCR. Multiple isothermal amplification methods are utilized for herb identification, including helicase-dependent amplification (HDA), LAMP, and recombinase polymerase amplification (RPA). As a result of the excellent performance of these isothermal amplification techniques, commercial kits such as LAMP (Eiken, Japan), RPA (TwistDx, UK), strand displacement amplification (SDA; Becton Dickinson, USA) and HDA (SAMBA, UK) kits have been developed for amplifying nucleic acids at constant temperature. HDA, in which template DNA strands are separated by a helicase instead of heat, was the first isothermal amplification technique used to achieve on-site authentication of ginseng [112].

5.2.1.Recombinase polymerase amplification (RPA)

Piepenburg et al. [135] first proposed the RPA reaction, an isothermal amplification involving a recombinase (T4 UvsX), a polymerase (Bsu) and a single-stranded DNA-binding protein (SSB) as well as two specific upstream and downstream primers. The technique searches for homologous sequences in double-stranded DNA by using the protein-DNA complexes formed by the recombinase and the primers. Once the primers locate the homologous sequence, a strand exchange reaction takes place, and DNA synthesis is initiated to exponentially amplify the target region on the template. The displaced strand is bound by SSB, preventing further displacement. In this system, a synthesis event is initiated by the two opposing primers. The overall procedure is very fast, with detectable levels of amplified products typically obtained in less than 10 min. Recombinase can be employed in the RPA method to generate a comprehensive isothermal profile by using certain primers and a probe [136]. RPA products can be read out with lateral flow dipstick (LFD) detection technology, which makes it easier to provide a visual result [137]. Several pathogens, including viruses, have been detected using the LFD method owing to its sensitivity, specificity, and visual readout [138,139]. According to recent reports, three types of medicinal plants have been discriminated using RPA technology. Liu et al. [140] first used RPA technology to discriminate Ginkgo biloba and its adulterant Sophora japonica. Zhao et al. [141] reported the rapid and visual distinction of saffron from its adulterants with minimal apparatus by combining RPA with LFD. They believed that this technique would work with fresh, dried, and processed materials. Zheng et al. [142] presented a procedure for rapidly differentiating between the poisonous plant G. elegans and the therapeutic plant L. japonica by combining DNA extraction from filter paper and RPA-LFD detection.

5.2.2.LAMP

LAMP technology has been employed for the detection of various pathogens relevant to human disease [143], meat identity, and genetically modified organisms (GMOs) since the beginning of its development. This molecular biology approach makes testing quick and precise, particularly in environments where the infrastructure required to support PCR facilities is inadequate. A set of four to six primers, each hybridizing to a different part of the target DNA sequence, is used together with the DNA polymerase from Bacillus stearothermophilus. After byproducts of amplification are generated, positive results are detected by a change in colour from transparent to vibrant green or via observation of turbidity caused by an increasing amount of magnesium pyrophosphate. The LAMP approach does not require specialized equipment or complex molecular methods. LAMP has been used to determine species in herbal medicine. Sasaki and Nagumo [144] originally created a LAMP-based technique that focused on the trnK gene sequence to confirm the distinction between Curcuma longa and C. aromatica. Given the good inter-species resolution of DNA barcodes, especially the ITS2 sequence, barcode-based LAMP is a fast strategy for the identification of herbs. Multiple alignments of genuine and adulterant ITS2 sequences can be used to design genuine-specific LAMP primers for authentication of Crocus sativus [145], Dendrobium officinale [146], Hedyotis diffusa [147], and Taraxacum formosanum [148]. Additionally, LAMP can be used in combination with barcoding analysis to detect the toxic additive Aristolochia fangchi [149].
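The idea of deriving genuine-specific primers from a multiple alignment of genuine and adulterant sequences can be sketched as a search for diagnostic alignment columns, that is, positions where the genuine species differs from every adulterant. Such columns are candidate anchor points for species-specific primers. The aligned sequences below are invented and far shorter than a real ITS2 alignment, and actual LAMP primer design involves further constraints (primer spacing, melting temperatures, secondary structure) that this sketch ignores.

```python
def diagnostic_sites(genuine, adulterants):
    """Return 0-based alignment columns where the genuine base differs from all adulterants."""
    length = len(genuine)
    if any(len(seq) != length for seq in adulterants):
        raise ValueError("aligned sequences must have equal length")
    sites = []
    for i, base in enumerate(genuine):
        if base == "-":
            continue  # skip gap columns in the genuine sequence
        if all(seq[i] != base for seq in adulterants):
            sites.append(i)
    return sites


# Hypothetical pre-aligned barcode fragments (assumptions for illustration only).
genuine_seq = "ACGTACGTAC"
adulterant_seqs = [
    "ACGAACGTTC",
    "ACGCACGTTC",
]

sites = diagnostic_sites(genuine_seq, adulterant_seqs)
print("diagnostic columns:", sites)
```

A column counts as diagnostic only if every adulterant differs there, so adding more adulterant sequences to the alignment can only shrink the candidate list, which mirrors why broad reference coverage matters for primer specificity.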

5.3.Meta-barcoding for mixed herbal medicine

A substantial advantage of DNA barcoding is its good ability to discriminate within and between species. However, first-generation sequencing has inherent limitations; it is not appropriate for identifying species in mixed samples and was previously used exclusively on samples comprising a single species. A versatile and precise multiplex species quantification method called vector control quantitative analysis (VCQA) was created by Zhao et al. [150] for use in herbal formulations. With the rapid development of HTS technology, shotgun and PCR-product meta-barcoding based on the genetic information of species has been effectively employed for herb identification in complex mixtures [151,152]. The Illumina platform was used to successfully and extensively sequence ITS2 products from complex mixture samples, and the information in the ITS2 nucleotide sequences after simple assembly determined the botanical ingredients of herbal preparations such as Qingguo Wan and Wuhu San [153]. Long-read sequencing technologies, such as single-molecule real-time (SMRT) sequencing provided by Pacific Biosciences (PacBio), are superior to short-read sequencing technologies in that they can produce longer reads of several kilobases with a distinctive circular template, enabling each template to be sequenced numerous times to produce a circular consensus sequence (CCS). For this reason, PacBio sequencing has been successfully applied to directly sequence barcoding templates for identifying species in mixed herb systems [154,155].
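The core step of meta-barcoding, assigning each amplicon read from a mixture to a reference barcode and tallying the result per species, can be sketched in a few lines. The toy classifier below scores a read by the share of its k-mers that occur in each reference and leaves it unassigned if the best score falls below a threshold. The reference sequences, reads, k-mer size and threshold are all hypothetical; real studies typically use alignment-based searches against curated reference libraries rather than this simplified scoring.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assign_reads(reads, references, k=4, min_share=0.8):
    """Tally reads per reference by the share of read k-mers found in that reference."""
    ref_kmers = {name: kmers(seq, k) for name, seq in references.items()}
    counts = Counter()
    for read in reads:
        read_kmers = kmers(read, k)
        best_name, best_share = "unassigned", 0.0
        for name, ref_set in ref_kmers.items():
            share = len(read_kmers & ref_set) / len(read_kmers) if read_kmers else 0.0
            if share > best_share:
                best_name, best_share = name, share
        if best_share < min_share:
            best_name = "unassigned"
        counts[best_name] += 1
    return counts


# Hypothetical reference barcodes and mixed-sample reads (illustration only).
references = {
    "species_A": "ACGTACGGTCATGCA",
    "species_B": "TTGACCGATAGGCTA",
}
reads = ["ACGTACGGTC", "GGTCATGCA", "CCGATAGGC", "GGGGGGGGG"]

counts = assign_reads(reads, references)
for name, n in sorted(counts.items()):
    print(name, n)
```

Read counts obtained this way indicate which species are present, but they are only a rough proxy for relative abundance, since amplification efficiency and DNA content differ between ingredients.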

6.Conclusions and future perspectives

Conventional DNA barcoding has been widely used in species identification of herbal medicine over the past decade. Recently, some genome-based studies have focused on population-level resolution and on a single individual per species, even determining the geographic origin of samples. Such studies are needed for the quality monitoring of medicinal materials. In addition, HTS provides affordable big data to perform low-level taxonomic studies. Future studies and applications of DNA barcoding may shift from inter-species to germplasm and interregional identification with the accomplishment of genome fine mapping, especially for entries in the Medicinal Plant Genome Database, which will make it an efficient tool for geo-herbalism research in the field of herbal medicine.

To avoid herbal misidentification and artificial adulteration and promote the healthy and sustainable development of the herb industry, further work on DNA barcoding in herbal medicine identification can be carried out in the following contexts: 1) continuously updating and improving DNA barcode databases to cover as many herbal varieties as possible; 2) promoting the sequencing of the whole genome and organelle genome of medicinal plants to provide a solid base for molecular marker selection, assisted breeding practice and phylogenetic analysis; 3) developing new specific gene loci for closely related medicinal species that are easily confused and are difficult to distinguish by universal barcodes to ensure the accuracy of clinical drug use; 4) combining DNA barcoding with physical and chemical analysis to achieve comprehensive quality evaluation of medicinal materials; 5) developing reliable and fast on-site molecular techniques integrating DNA barcodes guided by market requirements; 6) incorporating HTS technology and DNA barcoding technology for species quantification in mixed medicinal materials and undertaking qualitative work on samples with high levels of damage; and 7) further introducing DNA barcoding technology to the whole traditional Chinese medicine industrial process, including seeds, seedlings, medicinal materials, decoction pieces and CPMs, for whole-process supervision and traceability.

CRediT author statement

Shilin Chen: Writing - Reviewing and Editing, Funding acquisition; Xianmei Yin: Investigation, Visualization; Jianping Han: Data curation, Writing - Original draft preparation; Wei Sun: Visualization, Writing - Original draft preparation; Hui Yao: Visualization, Supervision; Jingyuan Song: Writing - Reviewing and Editing; Xiwen Li: Data curation, Writing - Original draft preparation, Funding acquisition.

Declaration of competing interest

The authors declare that there are no conflicts of interest.

Acknowledgments

The research was funded by the National Natural Science Foundation of China (Grant No.: U1812403-1) and the China Academy of Chinese Medical Sciences Innovation Fund (Grant No.: CI2021A03910).