
Research progress of data mining in the treatment of hypertension by traditional Chinese medicine

2021-05-27 Lu Zhao, Yu-Jie Wu, Ming-Quan Zhang

Food and Health, 2021, Issue 2

Lu Zhao, Yu-Jie Wu, Ming-Quan Zhang*

1 Basic Medical School of Hebei University of Chinese Medicine, Shijiazhuang 050200, Hebei, China.

Abstract With the gradual development of data mining technology, more and more data mining software has emerged and data mining methods have diversified, providing strong support for research on the treatment of hypertension with traditional Chinese medicine (TCM). This review systematically introduces the advantages, disadvantages and application examples of data mining software such as SPSS, the TCM inheritance support platform and the TCM clinical research information sharing system, and of data mining methods such as cluster analysis, Bayesian network and system evaluation. It is expected to promote the practical application of data mining in the study of the prevention, diagnosis and medication rules of TCM treatment of hypertension, and to provide a reference for the development of new software and new technology.

Key words: Data mining, Statistical methods, Hypertension, Traditional Chinese medicine

Background

Hypertension is a common cardiovascular and cerebrovascular disease in clinical practice, and it ranks first among the chronic diseases that affect the health of the Chinese population [1]. In recent years, traditional Chinese medicine has shown remarkable effects in the prevention, diagnosis, treatment and prognosis of hypertension owing to its unique advantages. The clinical medication rules and academic thought of doctors across the dynasties show a certain regularity. The literature on clinical research into TCM treatment of hypertension, and on the experience of ancient and modern doctors in treating hypertension, is increasing year by year. With the continuous development of data mining technology, the model for studying academic experience has developed from simple case reports and summaries of classical medical records to the construction of structured case databases. A variety of data mining methods are now used in combination to explore the syndrome differentiation methods, diagnostic ideas and medication characteristics of well-known TCM doctors in greater depth and at multiple levels [2]. A variety of data mining software and research methods have emerged accordingly, and this study summarizes them as follows.

Major data mining software

Microsoft Excel, SPSS and the SPSS series

Excel is installed on almost every computer and is an essential statistical tool for researchers. It is easy to use and is a good choice for beginners. Excel is not as powerful as professional statistical software such as SPSS, but it has rich calculation, statistics and charting functions, which can complete the basic analyses of biostatistics, such as the t test, regression analysis, descriptive statistics and analysis of variance [3].

SPSS was one of the earliest software packages developed for statistical analysis. It can perform statistical analysis, data mining, predictive analysis and decision support tasks, and is widely used in research statistics. The software is relatively simple to operate yet powerful. It provides exploratory data analysis, statistical description, contingency table analysis, bivariate correlation, rank correlation, partial correlation, analysis of variance, non-parametric tests, multiple regression, survival analysis, analysis of covariance, discriminant analysis, factor analysis, cluster analysis, nonlinear regression, logistic regression and more [4]. The biggest difference between SPSS and Excel lies in their statistical functions. Excel has only a few simple built-in statistical functions, whereas SPSS can complete professional statistical analyses. Its operation is simple: all operations are carried out in a unified window mode, and the general steps are to select variables, set statistical parameters and output results. Users do not even need much statistical expertise, since the default settings are often sufficient.

Tian et al. [5] conducted a retrospective study on the clinical characteristics, TCM syndrome types, prescription medication and other patterns of 2064 patients with hypertension. They imported the inpatient information into Excel and used SPSS to conduct descriptive analysis and baseline analysis of the patients' basic information. The t test and ANOVA were used for measurement data, the chi-square test was used for enumeration data, and association rule analysis and network graph analysis based on the Apriori algorithm were used for Chinese herbal medicine prescriptions. It was found that blood stasis, phlegm and deficiency were the basic pathological factors of hypertension, and that drugs for activating blood circulation and dredging collaterals, calming the liver and tonifying the kidney were commonly used in its treatment. Ligusticum wallichii was the core drug and, according to the syndrome type, it was combined with Gastrodia elata, yam, Angelica sinensis and Pinellia ternata to enhance efficacy.

Wei et al. [6] used Excel to generate random numbers for elderly patients with hypertension from 3 communities in a district of Shanghai, adopted multi-stage stratified sampling, and finally included 808 patients. SPSS was applied for descriptive statistics, the chi-square test, correlation analysis and logistic regression analysis. The authors concluded that the constitutions of elderly community patients with hypertension are predominantly of the deficiency type, such as qi deficiency, yang deficiency and yin deficiency. The function of the viscera in the elderly is easily disordered, which causes abnormal transportation of qi, blood and body fluid, leads to the accumulation of water and dampness, and forms a phlegm-dampness constitution. Therefore, when treating elderly patients with traditional Chinese medicine, the special constitution of these patients should be considered, and a comprehensive judgment should be made according to sex, smoking, whether antihypertensive drugs are taken and other factors.

Excel is simple to learn and easy to use, but its functions are limited. Its advantage is that it has numerous built-in functions and is easy to operate. Formulas and charts update immediately when the underlying data change, a feature that SPSS does not currently offer. SPSS is functionally complete, its operation is much simpler than that of programming-based software, and it is the software of first choice for ANOVA. It also supports a range of multivariate analyses and mixed-model analysis, and occupies an important position in medical statistics.

Traditional Chinese medicine inheritance assistance platform

The traditional Chinese medicine inheritance assistance platform software (V2.5) is an upgraded version of the traditional Chinese medicine inheritance assistance system (V1.0), developed by the Chinese Academy of Traditional Chinese Medicine. It has six systems: clinical collection, knowledge retrieval, data management, platform management, statistical reports and data analysis. The clinical collection system collects the information of the four diagnoses of TCM. The platform management system is divided into data management, system management and department management. The data management system enables rapid query and management of data. The knowledge retrieval system can query Chinese medicines, prescriptions and medical cases. The statistical report system is divided into prescription statistics and medical record statistics. Prescription statistics include basic information statistics, chemical composition statistics and pharmacological action statistics. Medical record statistics include patient information statistics, treatment effect statistics, symptom statistics, syndrome statistics, treatment rule statistics, drug frequency statistics, flavor and meridian tropism statistics, drug dosage statistics, TCM disease statistics and western medicine disease statistics. The data analysis system is likewise divided into prescription analysis and medical record analysis. Prescription analysis includes compositional analysis of recipes, formula analysis and dosage analysis. Medical record analysis includes syndrome analysis, formula-syndrome analysis and acupoint analysis. The software can also visualize, as networks, the relationships between TCM-TCM, symptom-syndrome, TCM-symptom, prescription-syndrome and symptom-syndrome-TCM. It provides an important reference for inheriting the clinical experience of traditional Chinese medicine and for the research and development of new drugs [2]. The software has comprehensive functions, a simple interface and simple operation, and is favored in Chinese medicine data mining research. Since 2016, when this system was first used for hypertension data mining research, 19 journal papers and 30 master's and doctoral theses have been published.

Using this system, Zhao et al. [7] analyzed the prescriptions for treating hypertensive myocardial fibrosis in CNKI. The commonly used drug-category pairings were liver-wind-calming with liver-wind-calming, blood-activating and stasis-removing with blood-activating and stasis-removing, and blood-activating and stasis-removing with liver-wind-calming; eight core combinations and four new prescriptions were derived. Based on this system, Chen et al. [8] analyzed 360 Chinese medicine compound prescriptions for hypertension and obtained 14 core combinations and 7 new prescriptions through data mining. Following the theory of "treating hypertension from the liver", hypertension should be treated mainly by calming the liver and extinguishing wind, with flexible addition of Chinese medicines for promoting blood circulation and removing blood stasis, promoting urination and reducing swelling, and nourishing yin and clearing heat, according to the symptoms and signs of the patient. Hu et al. [9] screened 274 prescriptions of Professor Wang Yulin for the treatment of liver-yang hyperactivity type hypertension, and the statistical analysis showed that cold-natured drugs were the main drugs. One of the new prescriptions obtained by data mining includes rehmannia glutinosa, Alisma alisma, bark peony, yam, uncarotene, scutellaria baicalin, Achyranthes acuminata, Eucommia ulmoides, etc., which is also Professor Wang's basic prescription in the clinical treatment of liver-yang hyperactivity type hypertension. The data mining results obtained with this software therefore have a certain reference value. Of course, in the treatment of liver-yang hyperactivity type hypertension, the method of clearing and purging the liver should not be applied mechanically; the disease trend of the patient, the damage to other viscera, and the waxing and waning of pathogenic and healthy qi should be considered comprehensively as the basis of treatment.

The traditional Chinese medicine inheritance assistance platform has played a good role in summarizing medication rules and inheriting the experience of famous doctors. However, the common drug pairs, prescription rules, core drugs and combinations, and new prescriptions obtained by the clustering and association rule methods still need to be verified in clinical practice. Syndrome differentiation and flexible application should therefore be combined with clinical practice, so as to better inherit the experience of famous doctors.

Clinical research information sharing system

The clinical research information sharing system aims to carry out clinical diagnosis, treatment and data collection simultaneously, and to achieve the collection, integration, storage, management, sharing and utilization of clinical data, so as to meet the various needs of real-world TCM clinical research and serve as its technical support platform. The system can be used to discover and verify new theories, technologies and methods of TCM, so as to improve clinical effect. It offers a variety of data mining methods, such as complex networks, dynamic decision process models and support vector machines. Notably, a complex subnet filter and a hierarchical core subnet mining algorithm were independently developed in this system for the study of clinical drug compatibility; they can predict the compatibility of core drugs and the rules for adding or removing drugs according to symptoms [10].

Wang et al. [11] screened the medical records of hypertensive patients treated by Professor Lin Huijuan, analyzed the data with the clinical research information sharing system, and obtained Professor Lin's core TCM prescription for the treatment of hypertension. The prescription takes Astragalus membranaceus, Gastrodia elata and Uncaria as the monarch drugs, which calm the liver, subdue yang and replenish qi. White peony, chuanxiong, coptis and angelica are the minister drugs, which tonify and activate blood, clear heat and reduce fire. Zizyphus jujube seed and pueraria root are the adjuvant drugs, which nourish and calm the mind. In addition, modern pharmacological studies show that puerarin has a good antihypertensive effect. Panax notoginseng powder and rehmannia are used to promote blood circulation and remove blood stasis, nourish yin and clear heat. The whole formula invigorates qi, regulates and calms the liver, enriches blood and promotes blood circulation, clears heat and nourishes yin, and tranquilizes the mind.

The system can be integrated with hospital clinical information systems to realize a stepwise mode of inheritance: clinical information collection, experience extraction through mining, validation in clinical application, mechanism research, and theory guiding clinical practice. However, it is not readily accessible to individual users and is mostly used by clinical research-oriented medical institutions to record daily diagnosis and treatment data [2].

Main data mining methods

Frequency analysis

Frequency refers to the number of times a given value of a variable occurs. The data are grouped according to a chosen criterion, and the number of individuals in each group is counted. The higher the frequency of a group, the greater the influence of that group's value on the overall level; conversely, the lower the frequency, the smaller its influence.

Shi et al. [12] collected 345 prescriptions from Professor Yang Chuanhua for the treatment of hypertension and analyzed the frequency of the drugs in the prescriptions. The most frequent Chinese medicines were moutan bark, Angelica sinensis, Scutellaria baicalensis and Ligusticum wallichii. Deficiency-tonifying drugs appeared most often, followed by heat-clearing drugs. Blood-activating and stasis-eliminating drugs, tranquilizing drugs and liver-wind-calming drugs each appeared more than 300 times, while qi-regulating drugs appeared relatively rarely. Professor Yang believes that chronic diseases are difficult to heal, that deficiency of the viscera dominates in the later stage, and that hypertension is no exception. Long-standing weakness of the viscera will eventually damage the kidney. Drugs are used flexibly in clinical practice according to whether the patient shows kidney-yin deficiency, kidney-yang deficiency or deficiency of both yin and yang, with good results. Li et al. [13] collected 275 prescriptions of Professor Huang Li and counted the frequency of drug use with SPSS. The high-frequency drugs were found to be Ligusticum wallichii, Radix Puerariae and Gastrodia elata, among others. It was concluded that Professor Huang Li treats senile hypertension mainly with drugs that enter the liver, spleen and heart meridians, in order to calm the liver, subdue yang, settle fright and calm the mind, while also promoting blood circulation, regulating qi and tonifying the liver and kidney.

Frequency analysis is the most commonly used and most basic analytical method. It reveals regularities in the data intuitively and plays an indispensable role in the study of traditional Chinese medicine in the treatment of hypertension.
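As a minimal sketch of the frequency counting described above, the following Python snippet tallies how many prescriptions contain each herb. The prescriptions and herb names are hypothetical placeholders, not data from the cited studies, and the function name `herb_frequencies` is introduced here only for illustration.

```python
from collections import Counter

# Hypothetical prescriptions; herb names are placeholders, not study data.
prescriptions = [
    ["gastrodia", "uncaria", "chuanxiong"],
    ["gastrodia", "chuanxiong", "poria"],
    ["uncaria", "chuanxiong", "pinellia"],
    ["gastrodia", "uncaria", "chuanxiong", "poria"],
]

def herb_frequencies(prescriptions):
    """Count the prescriptions containing each herb and the herb's relative frequency."""
    # set(rx) ensures a herb listed twice in one prescription is counted once.
    counts = Counter(herb for rx in prescriptions for herb in set(rx))
    total = len(prescriptions)
    return [(herb, n, n / total) for herb, n in counts.most_common()]

for herb, n, share in herb_frequencies(prescriptions):
    print(f"{herb:12s} {n:3d} {share:.0%}")
```

Sorting by count in this way corresponds to the "high-frequency drug" lists reported in the studies above; in practice the same tally is usually produced in SPSS or Excel.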

Cluster analysis

Cluster analysis is a method of simplifying data through data modeling. It classifies samples according to the internal structure of the data, and includes methods such as the K-means algorithm and hierarchical clustering. As an unsupervised method, clustering can avoid human subjectivity and can be used to study the relationships among Chinese medicines.

Wang et al. [14] sorted out 222 prescriptions, containing 218 traditional Chinese medicines, from the first to the fourth batch of clinical evidence and empirical prescriptions of senior Chinese medicine experts in the treatment of hypertension, and obtained 4 categories of traditional Chinese medicines commonly used in the treatment of hypertension by cluster analysis: chrysanthemum, prunella, radix achyranthes, uncaria, Gastrodia elata and other medicines that clear heat and calm the liver; Salvia miltiorrhiza, safflower, red peony root, angelica, mulberry parasitic, Eucommia ulmoides, pheretima and other medicines that promote qi and blood circulation and tonify the liver and kidney; rhizoma alisma, poria cocos, pinellia, Atractylodes macrocephala and other medicines that reduce phlegm, dispel wind, invigorate the spleen and remove dampness; and licorice, white peony, wolfberry, oyster, fossil fragments and other medicines that nourish yin and blood and tranquilize with heavy settling drugs. It can thus be seen that senior TCM doctors treat hypertension mainly by reducing liver and heart fire and clearing heat, with nourishing the liver and kidney as the foundation, while also promoting blood circulation, removing blood stasis, invigorating the spleen and dispelling dampness.

Pan et al. [15] conducted data mining on 78 prescriptions of Professor Xiao Changjiang through cluster analysis, and concluded that Professor Xiao mainly treated hypertension by calming liver wind and tonifying the liver and kidney, and also by promoting blood circulation and resolving phlegm. However, because treatment in clinical practice is based on syndrome differentiation, the dosage of each medicine and the focus of treatment differ even when the same prescription is used. Given in addition the limited number of prescriptions and the incompleteness of data mining methods, the patient's constitutional bias should be analyzed comprehensively in clinical practice in order to obtain the best therapeutic effect. Yin et al. [16] designed a "Hypertension Syndrome Factor Questionnaire", which was used to collect four-diagnosis information from 450 patients with hypertension at two time points five years apart. The symptoms were analyzed by the k-means method, and the syndromes were determined by clinical experience. At the earlier time point, the TCM syndromes of the 450 patients were divided into 306 cases (68.0%) of liver-yang hyperactivity, 117 cases (26.0%) of retention of phlegm-dampness, and 27 cases (6.0%) of deficiency of kidney essence. Five years later, the syndrome types were divided into 186 cases (41.3%) of deficiency of kidney essence, 150 cases (33.3%) of obstruction of the orifices by blood stasis, and 114 cases (25.3%) of deficiency of qi and blood. Comparison of the syndromes across the 5 years showed that the syndromes of patients with hypertension changed from excess to deficiency, the disease location changed from liver to kidney, and the pathological factors changed from phlegm and dampness to blood stasis, in accordance with the TCM theory that "prolonged disease involves the kidney and chronic disease enters the collaterals". Cluster analysis is widely used in the study of TCM syndromes. When analyzing syndromes, patients are grouped according to the similarity of their four-diagnosis information, which largely avoids errors caused by human subjective factors. The method analyzes what the individuals in each group have in common on the basis of individual differences, and each group is then assigned to a syndrome category using professional knowledge. Symptoms and other indicators can also be grouped through variable clustering to achieve dimensionality reduction [17].

In research on TCM treatment of hypertension, clustering algorithms are mainly used to calculate the similarity between data items in order to obtain medication rules, syndrome type judgments and prescription-syndrome relationships, and to extract useful information from a large number of medical cases.
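The k-means procedure mentioned above can be sketched in a few lines of plain Python. This is an illustrative sketch only: the binary symptom coding, the six hypothetical patients and the choice of initial centroids are all assumptions made for the example, and the cited studies used statistical software rather than this code.

```python
def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(len(centroids)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical patients coded as
# [dizziness, irritability, heavy head, greasy tongue coating] (1 = present).
patients = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
labels, centroids = kmeans(patients, centroids=[patients[0], patients[2]])
print(labels)
```

The algorithm only groups patients by symptom similarity; naming each cluster as a syndrome type still requires clinical judgment, as the text notes.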

Association rule analysis

Association rule analysis refers to the discovery of associations in large data sets, describing the laws and patterns by which certain attributes of a thing appear together. It can effectively mine the hidden knowledge in the data and allows more in-depth analysis of the relationships between data attributes. The Apriori algorithm is one of the best-known association rule mining algorithms. Prescription data reflect a doctor's medication rules, and analyzing the relationships between drug properties, drug efficacy and the combination of medicines within a prescription is a central task in such studies. Association rule algorithms are therefore often used to study medication rules. They can mine massive data sets and explore the relationships within the data, and are especially effective in the study of medication rules.

Wu et al. [18] collected and screened 119 traditional Chinese medicine prescriptions related to hypertension in the elderly, published over nearly 10 years, from CNKI, Wanfang Data and VIP Paper Check System. Based on Apriori association analysis, 29 commonly used drug pairs and 12 commonly used drug groups were mined. The support of the liver-calming and wind-extinguishing drug pair Gastrodia elata-uncaria was 31.356%; the support of the dampness-eliminating and phlegm-resolving drug pairs poria cocos-pericarpium citri reticulatae and pinellia-pericarpium citri reticulatae was 22.034% in both cases; and the support of the liver- and kidney-tonifying drug pair radix rehmanniae praeparata-dogwood was 21.186%. According to the association rules among high-frequency drugs, hypertension in the elderly is characterized mainly by deficiency in the root and excess in the manifestation. Treatment of the root deficiency should mainly invigorate the spleen, replenish qi and tonify the liver and kidney; treatment of the manifestation is to calm liver wind, dry dampness and resolve phlegm, and promote blood circulation and remove blood stasis. Zheng et al. [19] established a database by collecting from CNKI the literature on TCM compound prescriptions for hypertension with "yin deficiency and yang excess syndrome", and analyzed the association rules. The results showed only one eight-drug combination, namely Liuwei Dihuang Pill plus uncaria and radix achyranthis bidentatae. This indicates that most doctors agree on the efficacy of this combination in the treatment of hypertension with yin deficiency and yang excess, and that it can be used as the basic prescription for this syndrome.

Association rule algorithms directly link patient symptoms with the drugs in the prescription and do not consider the relationship between symptom characteristics and drug properties. When there are a large number of candidate itemsets, the Apriori algorithm becomes quite inefficient and produces a large number of useless rules, or may even fail to finish; it also places requirements on the type of input data. For this reason, software such as the traditional Chinese medicine inheritance assistance platform adopts an improved association rule extraction algorithm based on Apriori, which greatly improves efficiency.
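To make the notions of support and candidate pruning concrete, here is a small level-wise (Apriori-style) sketch in Python. It is a textbook-style illustration, not the improved algorithm used by the inheritance assistance platform, whose details the source does not give; the prescriptions, threshold and function names are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]
    frequent = {}
    while current:
        # Count support for the candidates of the current size.
        level = {}
        for cand in current:
            support = sum(cand <= t for t in transactions) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate with an infrequent k-subset (the Apriori property).
        k = len(next(iter(level))) + 1 if level else 0
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

def confidence(frequent, antecedent, consequent):
    """confidence(A -> B) = support(A and B) / support(A)."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]

# Hypothetical prescriptions (placeholder herb names, not study data).
prescriptions = [
    {"gastrodia", "uncaria", "poria"},
    {"gastrodia", "uncaria"},
    {"poria", "tangerine peel"},
    {"gastrodia", "uncaria", "tangerine peel"},
    {"poria", "tangerine peel", "pinellia"},
]
freq = frequent_itemsets(prescriptions, min_support=0.4)
for itemset, support in sorted(freq.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), f"{support:.0%}")
print(confidence(freq, {"poria"}, {"tangerine peel"}))
```

The number of candidates grows quickly with the number of frequent items, which is the efficiency problem described above; raising `min_support` is the usual first remedy.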

Factor analysis

Factor analysis groups variables according to the strength of the correlations among them: variables within the same group are highly correlated, whereas variables in different groups are weakly correlated or uncorrelated. Each group of variables represents one common factor. It is a dimension-reduction method that condenses information: the resulting factors carry most of the information in the original variables and are independent of one another, and they can be used in subsequent regression analysis, discriminant analysis, cluster analysis and so on. In TCM data mining research, factor analysis is mainly used to reduce the dimension of symptoms and signs. Symptoms with internal relationships are grouped to form the main common factors, which are then combined with professional clinical knowledge to extract the main syndromes; this reduces human subjectivity and simplifies the data.
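The idea can be written compactly using the standard orthogonal factor model. The notation below is introduced here for illustration and does not come from the cited studies: $x$ is the vector of $p$ observed symptom scores, $f$ is the vector of $k < p$ common factors, $\Lambda$ is the $p \times k$ loading matrix, and $\varepsilon$ collects the symptom-specific terms with diagonal covariance $\Psi$.

```latex
x = \mu + \Lambda f + \varepsilon, \qquad
\operatorname{Cov}(f) = I_k, \qquad
\operatorname{Cov}(\varepsilon) = \Psi, \qquad
\operatorname{Cov}(x) = \Lambda \Lambda^{\top} + \Psi
```

Symptoms with large loadings on the same column of $\Lambda$ are the ones grouped under one common factor, which is then interpreted clinically as a syndrome element.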

Hou et al. [20] collected and analyzed the four-diagnosis information of 2943 patients with hypertension, from which a total of 10 common factors could be extracted. After consulting the relevant literature and clinical diagnosis and treatment experience, the main syndromes of patients with hypertension were identified as yang excess, phlegm turbidity, blood stasis, qi deficiency, yin deficiency and yang deficiency. The disease location factors were the liver, spleen, kidney and heart.

Yu et al. [21] analyzed the syndromes of 223 cases of hypertension and found that the main syndromes of local patients with hypertension were yin deficiency, qi deficiency, phlegm and blood deficiency. Patients with hypertension in this area can be treated with TCM intervention.

Yao et al. [22] collected the TCM constitution characteristics of 120 patients with refractory hypertension. The factor analysis showed that male patients mainly had phlegm-dampness and damp-heat constitutions, whereas female patients mainly had qi-stagnation, phlegm-dampness, blood stasis and yin deficiency constitutions. Even within the same sex, TCM constitution characteristics differ across age stages. Constitution affects the occurrence, development and prognosis of the disease. In view of this, the patient's constitutional bias should be taken into account in early TCM intervention, and the same disease should be treated differently according to the individual.

Hypertension mostly presents as intermingled deficiency and excess. There is no uniform standard for, or understanding of, the TCM syndromes and syndrome elements of hypertension. Factor analysis can help clinicians grasp the main syndromes and syndrome elements in clinical diagnosis and treatment, thus achieving twice the result with half the effort.

Principal component analysis

Principal component analysis (PCA) is a multivariate statistical method that condenses many variables into a small number of important composite variables through linear transformation. In the study of TCM syndrome patterns, PCA is mainly used to reduce the dimension of multiple symptoms and to analyze their syndrome classification comprehensively.

Zhang et al. [23] applied PCA and related methods and concluded that, among residents of the Baotou pastoral area, age, family history of hypertension and the frequency of drinking salted tea were major risk factors for hypertension. They therefore suggested that patients with hypertension in this area could lower their risk by reducing alcohol consumption and the frequency of drinking salted tea, and should abstain from alcohol if necessary.

Wang et al. [24] took 1000 patients with hypertension as the research object and, through PCA, concluded that chest tightness, heaviness of the head and body, abdominal distension and fullness, loose stools, thin and fat tongue with tooth marks, greasy white coating and a slippery pulse are the main points of syndrome differentiation for excessive phlegm-dampness in hypertension. The key points for hyperactivity of liver yang are vertigo, a top-heavy sensation, dry or bitter mouth, soreness and weakness of the waist and knees, red tongue, yellow coating and a rapid pulse. Chest pain, a dark purple tongue with petechiae, thin white coating and an astringent pulse are the main points for blood stasis obstructing the collaterals. The key points for deficiency of qi and blood are a sallow complexion, weakness, palpitations, a pale tongue and a deep, thin pulse. These findings provide a reference for the study of syndrome diagnostic standards for this disease.

PCA does not require the data to be normally distributed. It synthesizes and simplifies the original variables, determines the weight of each index objectively, and avoids the arbitrariness of subjective judgment. However, PCA is only suitable for data with strong correlations between variables, and a small amount of information is lost after dimension reduction, so the components cannot contain all of the original data.
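The trade-off between dimension reduction and information loss can be illustrated with a short, dependency-free sketch that extracts the first principal component by power iteration on the covariance matrix. The symptom scores are hypothetical, and power iteration is chosen here only to keep the example self-contained; statistical packages normally use a full eigendecomposition or SVD.

```python
def first_principal_component(rows, iterations=200):
    """Return (direction, explained_ratio) of the first principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue (variance along v).
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance

# Hypothetical symptom severity scores for five patients (three correlated symptoms).
scores = [
    [1, 2, 1],
    [2, 4, 2],
    [3, 6, 2],
    [4, 8, 4],
    [5, 10, 4],
]
direction, ratio = first_principal_component(scores)
print(direction, ratio)
```

The explained ratio is the share of total variance retained by the first component; whatever remains is the information lost by reducing three correlated symptoms to a single score.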

Logistic regression analysis

Logistic regression is a binary classification algorithm. It only needs to compute a linear function of the predictors and is therefore a linear model. It is mainly used to explore the effect of multiple independent variables on a categorical dependent variable, and can be used to analyze the relationship between disease and risk factors.

Liu et al. [25] performed logistic regression analysis on the TCM constitution of patients with hypertension who developed cardiovascular disease. The results showed that yin deficiency, qi deficiency, yang deficiency, blood stasis and phlegm-dampness (P<0.05) are high-risk factors for cardiovascular disease in patients with hypertension, whereas the gentle (balanced) constitution (P<0.05) is a protective factor. Developing TCM conditioning approaches according to TCM constitution may help prevent cardiovascular disease in patients with hypertension.

Zhang et al. [26] used single-factor and multiple-factor logistic regression to analyze the prevalence of hypertension among Wenzhou residents and its influencing factors. Single-factor logistic regression showed that age, sex, occupation, marital status, smoking, drinking, central obesity, a salty diet, diabetes and dyslipidemia may be risk factors for hypertension (P<0.01), and that education level may be a protective factor (P<0.01). Residents with a low education level lack understanding of unhealthy lifestyles and are more prone to bad habits. It is therefore necessary to carry out hypertension health education in Wenzhou to reduce the incidence of hypertension. With the improvement of living standards, unhealthy lifestyles are causing cardiovascular and cerebrovascular diseases to appear at younger ages. A number of logistic regression analyses show that prevalence among young people is high, so the prevention and control of hypertension cannot be limited to the elderly and should start with young people. For people whose constitution predisposes them to hypertension, the disease should be prevented before it occurs. Once a rise in blood pressure is found, it should be treated promptly to avoid complications that increase the burden of disease and reduce quality of life.
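A minimal sketch of how such a model links a risk factor to a disease outcome is given below. It fits a one-predictor logistic regression by plain gradient ascent and reports the odds ratio. The 16 patients, the exposure coding, the learning rate and the function name `fit_logistic` are hypothetical choices for illustration; real analyses would use SPSS or a statistics library and would also report confidence intervals and P values.

```python
import math

def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit P(y=1|x) = sigmoid(b0 + b.x) by batch gradient ascent on the log-likelihood."""
    n, d = len(X), len(X[0])
    weights = [0.0] * (d + 1)  # weights[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for row, target in zip(X, y):
            z = weights[0] + sum(w * x for w, x in zip(weights[1:], row))
            error = target - 1.0 / (1.0 + math.exp(-z))
            grad[0] += error
            for j, x in enumerate(row):
                grad[j + 1] += error * x
        weights = [w + lr * g / n for w, g in zip(weights, grad)]
    return weights

# Hypothetical data: x = 1 if the patient has a given constitution type, else 0;
# y = 1 if the outcome occurred. Exposed: 6 of 8 events; unexposed: 2 of 8 events.
X = [[1]] * 8 + [[0]] * 8
y = [1, 1, 1, 1, 1, 1, 0, 0] + [1, 1, 0, 0, 0, 0, 0, 0]

weights = fit_logistic(X, y)
odds_ratio = math.exp(weights[1])
print(f"coefficient = {weights[1]:.3f}, odds ratio = {odds_ratio:.2f}")
```

An odds ratio above 1 marks the predictor as a risk factor and one below 1 as a protective factor, which is how the constitution types in the studies below are classified.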

Complex network

Theologies such as "holism concept" and "treatment based on syndrome differentiation" reflect the integrity and complexity of TCM theory.Most data mining technologies only mine the surface rules of data,the deep relationship of TCM theory is very complex,so it is necessary to use complex network technology to dig the deep level of TCM syndrome differentiation demonstration and prescription medication rules [27].

Lu et al.[28]selected a total of 1209 TCM prescriptions for hypertension in the outpatient department of nephrology within 2 years.According to the four main syndrome types like hypofunction of both the spleen and the lung,spleen deficiency and blood stasis,liver and kidney Yin deficiency,spleen deficiency and dampness-heat,the complex network combined with BK algorithm was used to mine the core prescriptions,and 10 TCM prescriptions in each group were obtained,all of which included atractylodis.Many contemporary doctors believe that spleen deficiency is the pathogenesis of chronic kidney disease,as well as the pathogenesis of proteinuria in hypertensive nephropathy.Atractylodes is the holy medicine for invigorating the spleen.

Wu et al. [29] used a complex network to analyze the common TCM syndrome types and prescriptions of essential hypertension reported since 2001, and identified the six most common TCM syndrome types: hyperactivity of liver and Yang, hyperactivity of Yin deficiency and Yang deficiency, excessive accumulation of phlegm and dampness, deficiency of Yin and kidney, deficiency of Yin and Yang, and hyperactivity of liver fire. It can thus be seen that the disease location of hypertension tends to be the liver and kidney.

Most complex network studies are retrospective literature analyses. There is no uniform standard for the accuracy of syndrome differentiation among doctors, so the reliability of the conclusions needs to be verified in clinical studies.

Decision tree

The decision tree is a data mining method that classifies data step by step from top to bottom. It presents the classification model very intuitively, is fast and efficient, and is easy to understand.
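The top-down splitting step can be sketched with information gain, one common splitting criterion (the source does not say which criterion the cited studies used). The symptom records, feature names and labels below are hypothetical. The feature with the highest gain would become the root of the tree, and the same procedure would then be repeated on each branch.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Reduction in label entropy obtained by splitting on one feature."""
    n = len(rows)
    remainder = 0.0
    for value in set(r[feature] for r in rows):
        subset = [lab for r, lab in zip(rows, labels) if r[feature] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

def best_split(rows, labels):
    """Return the feature with the highest information gain."""
    return max(rows[0], key=lambda f: information_gain(rows, labels, f))

# Hypothetical four-diagnosis records (1 = symptom present).
rows = [
    {"greasy_coating": 1, "dizziness": 1},
    {"greasy_coating": 1, "dizziness": 0},
    {"greasy_coating": 0, "dizziness": 1},
    {"greasy_coating": 0, "dizziness": 0},
    {"greasy_coating": 1, "dizziness": 1},
    {"greasy_coating": 0, "dizziness": 1},
]
labels = [1, 1, 0, 0, 1, 0]   # 1 = phlegm-damp syndrome (hypothetical)

root = best_split(rows, labels)
root_gain = information_gain(rows, labels, root)
print(root, root_gain)
```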

Tian et al. [30] established a diagnostic model of phlegm-damp syndrome in hypertension by using decision tree and neural network methods, based on the TCM four-diagnosis information of 926 patients with hypertension. They concluded that the diagnostic rules were dizziness with a heavy head, nausea, excessive sputum and salivation, chest tightness, white and greasy tongue coating, mental tiredness and puffiness, with an accuracy of 93.74%, which provided a basis for standardizing the TCM syndrome of phlegm-damp in hypertension.

The decision tree model has the advantages of being clear and intuitive and of inducing classification rules automatically, and it plays an important role in processing large-sample hypertension data. However, it also has some limitations, such as the discontinuity of data types.

Artificial neural network

An artificial neural network (ANN) is an algorithmic model for distributed parallel information processing that is based on the structure and function of animal neural networks and takes the neuron as its basic operation unit.

Zhang et al. [31] established a TCM risk early warning model for hypertension through an artificial neural network. Among the input variables, the continuous variables were BMI, respiration and GLU, and the categorical variables were family history of hypertension, slightly sour taste, slightly light taste, thick tongue coating and strong pulse. The output was a dichotomous variable indicating whether or not the subject had hypertension. The BP ANN was constructed with 8 neurons in the input layer, 20 neurons in the hidden layer and 1 neuron in the output layer. The BP model accurately identified 133 patients with hypertension and 67 people without hypertension. It was concluded that the risk of hypertension was related to dietary taste: the risk was lower if the taste was slightly sour or light, but higher if it was salty or pickled. In terms of physique, the risk was low for Yang deficiency and high for phlegm-dampness, qi-deficiency and qi-stagnation. In terms of tongue and pulse, the risk was low for a sinking pulse and high for a string and hard pulse. In addition, the classification sensitivity, specificity, Youden index, consistency and other indicators of the BP ANN model and the Logistic regression model were compared, and the former was slightly better than the latter in predictive ability.
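The 8-20-1 structure described above can be sketched as a plain forward pass. This is not the model of [31]: the weights are random and untrained, the sigmoid activation is an assumption, and the input vector is hypothetical. The sketch only shows how eight inputs flow through twenty hidden neurons to a single output between 0 and 1.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, layers):
    """Propagate an input vector through every layer in turn."""
    a = x
    for weights, biases in layers:
        a = [sigmoid(sum(w * v for w, v in zip(row, a)) + b)
             for row, b in zip(weights, biases)]
    return a

rng = random.Random(0)
layers = [init_layer(8, 20, rng), init_layer(20, 1, rng)]   # 8-20-1 network

# Weights plus biases: 8*20 + 20 in the hidden layer, 20*1 + 1 in the output.
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)

# Hypothetical input: three scaled continuous values and five 0/1 indicators.
x = [0.3, 0.5, 0.4, 1, 0, 0, 1, 0]
risk = forward(x, layers)[0]
print(n_params, risk)
```

In a BP network these weights would then be adjusted by back-propagating the prediction error over the training set. That step is omitted here.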

The classification sensitivity, specificity, Youden index and consistency rate of the artificial neural network model were higher than those of the multivariate Logistic regression model. However, when the study sample size is limited, the resulting prediction model is also limited, and the processing of high-dimensional variables has limitations as well, so the popularization of artificial neural networks needs further research.
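The indicators used in this comparison can be computed from a confusion matrix, as in the sketch below. The counts are hypothetical and are not the results of [31], and the "consistency rate" is taken here to mean overall agreement (accuracy), which is an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, Youden index and agreement rate from counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": sensitivity + specificity - 1,
        "agreement": (tp + tn) / (tp + fn + tn + fp),
    }

# Hypothetical counts: 50 hypertensive and 50 non-hypertensive subjects.
metrics = diagnostic_metrics(tp=45, fn=5, tn=40, fp=10)
print(metrics)
```

The Youden index ranges from -1 to 1. A value near 0 means the model does no better than chance, so comparing it across two models summarizes sensitivity and specificity in a single number.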

Bayesian network

A Bayesian network, also known as a belief network or causality network, is a directed acyclic graph that describes the dependence among random variables in graphical form. It is based on the Bayesian formula. When a Bayesian network is used for parameter estimation, the statistical characteristics of the parameters can be described reliably. In traditional Chinese medicine research it is mostly used to develop expert diagnosis and treatment systems, but it can also be used for TCM syndrome modeling, mining syndrome rules and so on.
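A very small network illustrates how the Bayesian formula is applied over a directed acyclic graph. The structure (one syndrome node pointing to two symptom nodes) and all probabilities below are hypothetical. The joint distribution factorizes along the edges, and the posterior probability of the syndrome is obtained by enumeration.

```python
# Hypothetical network: syndrome -> dizziness, syndrome -> greasy coating.
prior = {True: 0.3, False: 0.7}        # P(syndrome)
p_dizzy = {True: 0.8, False: 0.2}      # P(dizziness | syndrome)
p_greasy = {True: 0.7, False: 0.1}     # P(greasy coating | syndrome)

def joint(syndrome, dizzy, greasy):
    """P(syndrome, dizzy, greasy), factorized along the network edges."""
    pd = p_dizzy[syndrome] if dizzy else 1 - p_dizzy[syndrome]
    pg = p_greasy[syndrome] if greasy else 1 - p_greasy[syndrome]
    return prior[syndrome] * pd * pg

def posterior(dizzy, greasy):
    """P(syndrome = True | observed symptoms), by the Bayesian formula."""
    num = joint(True, dizzy, greasy)
    return num / (num + joint(False, dizzy, greasy))

post = posterior(dizzy=True, greasy=True)
print(round(post, 3))
```

Observing both symptoms raises the probability of the syndrome from the prior of 0.3 to about 0.92 under these assumed numbers. This is the kind of update an expert diagnosis system built on a Bayesian network performs.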

Lv [32] collected the medical records of 124 patients with hypertension to establish a database, and used three data mining classification methods, including the Bayesian network, to classify the clinical cases of hypertension. Among the three algorithms, the Bayesian network had the shortest running time and the highest accuracy.

Wang [33] divided diabetes mellitus complicated with hypertension into three levels according to the diagnostic grading standard of hypertension. Each level was divided into three common hypertension syndrome types. The first level comprised Yin deficiency and Yang excess combined with blood stasis syndrome, Yin deficiency and heat excess combined with blood stasis syndrome, and Yang deficiency and blood stasis syndrome. The second comprised Yin deficiency and Yang excess combined with damp-heat and blood stasis, Yin and Yang deficiency combined with blood stasis, and Yang deficiency and blood stasis. The third comprised Yang deficiency and blood stasis syndrome, Yin and Yang deficiency combined with blood stasis syndrome, and Yin and Yang deficiency combined with damp-heat and blood stasis syndrome. The high-frequency prescriptions at each of the three levels were taken as the analysis points, and all symptoms and physicochemical indexes related to the analysis points were entered as variables. A Bayesian network was then used to find the causal relationships and degrees of correlation between the analysis points and the variables, and among the variables themselves. From the relationships among syndromes, physical and chemical indexes, symptoms and prescriptions, it was concluded that blood stasis runs through the whole course of diabetes complicated with hypertension. The common syndromes are Yin deficiency and Yang excess syndrome, phlegm dampness block syndrome and Yang deficiency syndrome. The main symptoms are dry mouth, fatigue, dizziness and headache, together with chest tightness, palpitations, limb numbness, limb pain, edema and so on. Representative prescriptions are Tianmagouteng Decoction, Banxiabaizhutianma Decoction, Shenqimaiweidihuang Decoction, Xuefuzhuyu Decoction, Shenqi pill, etc.

The core idea of the Bayesian network is to decouple the problem and decompose a complex distribution, but as the target system becomes more complex, this design approach encounters more and more difficulties. The traditional Bayesian network, in both structure and parameters, leans toward manual design, is influenced by subjective factors and therefore has limitations. Learning the parameters from data, so that the data drive the network parameters and reshape the network structure, is the trend of future development.

Support vector machine

Like artificial neural networks, support vector machines are learning machines, but unlike neural networks they rely on mathematical methods and optimization techniques. When the sample data are not linearly separable, a support vector machine can use a kernel function to map a complex classification task into a linearly separable problem, which has made the method very popular in the field of traditional Chinese medicine. It is mainly used to model four-diagnosis information, to make TCM diagnosis objective, and to standardize TCM syndromes.
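The role of the kernel function can be illustrated with a toy example. The four XOR-style points below are hypothetical and cannot be separated by a straight line in the original two dimensions. Under the explicit feature map of the degree-2 polynomial kernel they become linearly separable, and the kernel value equals the inner product in the mapped space. The weight vector is chosen by hand for illustration and is not the output of a trained support vector machine.

```python
import math

def phi(x):
    """Explicit feature map of the kernel k(x, z) = (x . z)^2 in two dimensions."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2."""
    return dot(x, z) ** 2

# XOR-style data: not linearly separable in the original space.
points = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
labels = [-1, -1, 1, 1]

# Hand-picked separating hyperplane in the mapped space (illustrative only).
w = (0.0, -1.0, 0.0)
predictions = [1 if dot(w, phi(p)) > 0 else -1 for p in points]

# The kernel computes the mapped inner product without building phi explicitly.
kernel_matches = all(
    abs(poly_kernel(p, q) - dot(phi(p), phi(q))) < 1e-9
    for p in points for q in points
)
print(predictions, kernel_matches)
```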

Xu et al. [34] analyzed the TCM four-diagnosis information of 549 patients with hypertension and studied the classification of hypertension syndromes by using a support vector machine. The results showed that the syndrome types classified with higher accuracy were the phlegm and stasis intercombination type, the Yin deficiency and Yang hyperactivity type, the hyperactivity of liver fire type and the deficiency of kidney qi type, while the accuracy for other types was relatively low. This indicated that the support vector machine method was highly feasible and could be used to analyze the syndrome differentiation of hypertension in a large number of cases. Building on this, Xu added blood lipid, serum uric acid and fasting blood glucose indicators to the four-diagnosis information in the model. When tested again, the prediction accuracy for the phlegm and stasis intercombination, Yin deficiency and Yang hyperactivity, hyperactivity of liver fire, and deficiency of kidney qi types all exceeded 80%, indicating that there are regular relationships between the TCM syndromes of hypertension and blood lipid, blood uric acid and fasting blood glucose.

In the study of TCM treatment of hypertension, the support vector machine can be used to examine the relationship between TCM syndromes and biochemical indexes such as blood lipid, blood uric acid and fasting blood glucose. In the practical application of TCM data mining, multi-class problems can be handled by combining multiple binary (two-class) support vector machines or by constructing an ensemble of classifiers.
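One common way to combine binary classifiers into a multi-class classifier is one-vs-one voting, sketched below. The source does not say which combination scheme is meant, so this is one possible reading. The syndrome names, feature values and class centers are hypothetical, and each pairwise decision function is a simple linear stand-in for a trained binary support vector machine.

```python
from collections import Counter
from itertools import combinations

def make_decider(center_a, center_b):
    """Linear stand-in for a trained binary SVM: a positive value means class a."""
    def decide(x):
        dist_a = sum((xi - ai) ** 2 for xi, ai in zip(x, center_a))
        dist_b = sum((xi - bi) ** 2 for xi, bi in zip(x, center_b))
        return dist_b - dist_a   # linear in x after expansion
    return decide

def one_vs_one_predict(x, pairwise):
    """Let every pairwise classifier vote and return the class with most votes."""
    votes = Counter()
    for (a, b), decide in pairwise.items():
        votes[a if decide(x) >= 0 else b] += 1
    return votes.most_common(1)[0][0]

# Hypothetical class centers in a two-feature space.
centers = {
    "phlegm_stasis": (1.0, 0.0),
    "yin_deficiency": (0.0, 1.0),
    "liver_fire": (-1.0, -1.0),
}

# k classes need k*(k-1)/2 pairwise classifiers (3 for k = 3).
pairwise = {
    (a, b): make_decider(centers[a], centers[b])
    for a, b in combinations(centers, 2)
}

prediction = one_vs_one_predict((0.9, 0.1), pairwise)
print(prediction)
```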

Systematic evaluation and Meta-analysis

Systematic evaluation and Meta-analysis are important methods in evidence-based medicine research. Systematic evaluation starts from a specific clinical problem, systematically and comprehensively collects existing published or unpublished clinical research data, screens the literature that meets the inclusion requirements, and draws conclusions through qualitative and quantitative synthesis. These conclusions can be updated as clinical research progresses.

Wang et al. [35] systematically evaluated pulse objectification in 1503 patients with hypertension, including 12 case-control studies involving 8 pulse meters. The results of the Meta-analysis showed significant differences in the pulse wave parameters h1, h3/h1, h5 and w1/t between hypertension patients and normal people. To some extent, the pulse information of hypertensive and non-hypertensive people is objectively different, and the main pulses of hypertension patients are the wiry pulse and the slippery pulse.

Luo et al. [36] included 16 studies on the treatment of essential hypertension by acupuncture and moxibustion. The Meta-analysis showed that the efficacy of acupuncture and moxibustion in the treatment of essential hypertension was significantly higher than that of the control group, with a difference in the effective rate (OR=2.74; 95%CI [1.79, 4.18]; P<0.00001). This indicates that acupuncture and moxibustion are effective in treating essential hypertension, that the long-term efficacy of the acupuncture and moxibustion group is better than that of the western medicine group, and that the accuracy of the overall efficacy value is increasing year by year. It should be noted, however, that the stability of the efficacy cannot be completely guaranteed.

Yao et al. [37] retrieved 46 studies on the treatment of hypertension by Guipi Decoction from the database, of which 10 were included after screening. Meta-analysis was conducted on systolic blood pressure, self-rating depression scale score and quality of life assessment scale score respectively, and the results showed statistically significant differences between the treatment group and the control group. This suggests that Guipi Decoction is better than western medicine in reducing systolic and diastolic blood pressure, relieving depression and improving quality of life.
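The figures reported for the acupuncture and moxibustion Meta-analysis (OR=2.74; 95%CI [1.79, 4.18]; P<0.00001) can be checked for internal consistency. The sketch below assumes that the interval is a Wald interval computed on the log odds ratio scale with a normal approximation, which the source does not state. Under that assumption the standard error, the z statistic and the two-sided P value can be recovered from the reported numbers alone.

```python
import math

# Values reported for [36] in the text.
or_point, ci_low, ci_high = 2.74, 1.79, 4.18

# Assumption: 95% Wald interval on the log scale, ln(OR) +/- 1.96 * SE.
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
z = math.log(or_point) / se_log_or

# Two-sided P value from the standard normal distribution.
p_two_sided = math.erfc(z / math.sqrt(2))

# On the log scale the interval is symmetric, so the geometric mean of
# its bounds should reproduce the point estimate up to rounding.
geometric_mid = math.sqrt(ci_low * ci_high)

print(round(se_log_or, 3), round(z, 2), p_two_sided, round(geometric_mid, 2))
```

Under this assumption the recovered P value is below 0.00001 and the geometric mean of the bounds is close to 2.74, so the three reported figures agree with one another.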

The systematic evaluation method is innovative and has obvious advantages in dealing with differing research results. Its limitations are that sample sizes are limited, that not all relevant data can be extracted completely, and that it is prone to theoretical deviation. In addition, the definition of clinical end points is often unclear.

Conclusion

In recent years, data mining technology has been widely applied in the study of TCM treatment of hypertension. The research methods include association rules, clustering algorithms, classification algorithms and so on, and some TCM data mining software has also emerged as the times require. These tools have greatly improved the efficiency and accuracy of TCM diagnosis, syndrome differentiation, prescription and treatment, and they provide guidance for the clinical experience of TCM treatment of hypertension. At the same time, problems of data quality, technical standards, and promotion and application cannot be ignored. Only by accelerating the improvement of the practicability of data mining technology can we better meet the needs of the development of TCM and bring new directions and opportunities for future exploration.