Knowledge domain and emerging trends in Alzheimer's disease: a scientometric review based on CiteSpace analysis

2019-07-17

Shuo Liu, Ya-Ping Sun, Xu-Ling Gao, Yi Sui

1 The Fourth People's Hospital of Shenyang, Shenyang, Liaoning Province, China

2 The First People's Hospital of Shenyang, Shenyang, Liaoning Province, China

Abstract Alzheimer's disease is the most common cause of dementia. It is an increasingly serious global health problem and has a significant impact on individuals and society. However, the precise cause of Alzheimer's disease is still unknown. In this study, 11,748 Web-of-Science-indexed manuscripts regarding Alzheimer's disease, all published from 2015 to 2019, and their 693,938 references were analyzed. A document co-citation network map was drawn using CiteSpace software. Research frontiers and development trends were determined by retrieving subject headings with apparent changing word frequency trends, which can be used to forecast future research developments in Alzheimer's disease.

Key Words: nerve regeneration; Alzheimer's disease; neuroprotection; mapping knowledge domain; Web of Science; CiteSpace; neural regeneration

Introduction

In 1906, Alois Alzheimer, a German psychiatrist, described the first case of Alzheimer's disease (AD) in a 50-year-old woman. Alois Alzheimer continued to track her progress until the patient died in 1906 (Alzheimer, 1907). AD is characterized by the widespread distribution of neuronal tangles and amyloid plaques in the brain, accompanied by astrocyte proliferation, cerebral atrophy, neuronal loss, and vascular changes (Berchtold and Cotman, 1998; Fan et al., 2014). AD is clinically defined as a progressive neurodegenerative disease. Most patients have symptoms such as memory loss, executive dysfunction, and behavioral changes. Cognitive impairment gets progressively worse over the duration of the disease (Smits et al., 2011). As the global population ages, AD incidence is increasing, and this not only endangers the health of the elderly, but also places a heavy burden on the family and society. AD has therefore attracted wide attention as a focus of research.

There are two types of AD: familial and sporadic AD. Both have clinical and pathological similarities, and manifest as progressive cognitive dementia with the development of senile plaques composed of amyloid beta peptide and neurofibrillary tangles composed of phosphorylated tau protein (Lee et al., 2011; Young and Goldstein, 2012). Axonal transport defects, synaptic loss, and selective neuronal death are other cellular phenotypes that are shared by both familial and sporadic AD (Young and Goldstein, 2012).

Familial AD refers to AD that occurs in a family in which two or more generations are affected. Traceable family members include three or more patients, and the age of onset is earlier than 60 years. Familial AD accounts for approximately 5% of all AD patients. In 1991, researchers discovered a mutation in exon 17 of the amyloid precursor protein (APP) gene, on chromosome 21, in early-onset familial AD patients. Since then, the study of AD has entered the new field of molecular genetics. Familial AD is caused by highly penetrant mutations in the PS1, PS2, and APP genes, as well as by rare autosomal mutations. The APP protein is the basis of central nervous system function, and its roles include synapse formation, neurogenesis, axonal transport, signal transduction, and neural plasticity (Holtzman et al., 2011; Brunholz et al., 2012; Lazarov and Demars, 2012; Young and Goldstein, 2012; Abud et al., 2017; Arber et al., 2017).

Sporadic AD generally occurs in patients over 65 years old. Besides age, risk factors for sporadic AD include cardiovascular disease, low education, depression, and having the apolipoprotein E4 (ApoE4) haplotype (Duncan and Valenzuela, 2017). Although there are no clear dominant or recessive sporadic AD mutations, many genetic variants have been identified and this disease has strong heritable components (Avramopoulos, 2009; Young and Goldstein, 2012). Therefore, sporadic AD has a multifactorial origin, which is partly caused by complex genetic characteristics and partly influenced by environmental factors and their interactions (Duncan and Valenzuela, 2017).

Since Alois Alzheimer reported the first case of AD in 1906, many studies have been conducted and our understanding of AD pathogenesis has made great progress; however, there is still no disease-modifying treatment for AD.

CiteSpace is a web-based Java application for data analysis and visualization (Chen, 2004). It is a unique and influential application in the field of information visualization analysis. CiteSpace supports analyses of co-citations, co-authorship, and co-occurring keywords (Chen, 2013), which helps to provide direction in the analysis of a research area. CiteSpace has three core concepts: burst detection, betweenness centrality, and heterogeneous networks. These concepts can solve three practical problems: identifying the nature of research frontiers, marking keywords, and identifying emerging trends and sudden changes over time (Chen, 2006). The main procedural steps of CiteSpace are time slicing, thresholding, modeling, pruning, merging, and mapping (Chen, 2004), and the main source of input data for CiteSpace is the Web of Science database. CiteSpace can identify frontier areas of current research by extracting burst terms from the titles, abstracts, descriptors, and identifiers of bibliographic records. CiteSpace also makes it easier for users to recognize key points by identifying nodes with high betweenness centrality (Freeman, 1978). Nodes with high betweenness centrality scores are a useful indicator of how different clusters are connected. In CiteSpace, betweenness centrality scores are normalized to the unit interval [0, 1]. A node of high betweenness centrality is usually one that connects two or more large groups of nodes with the node itself in between, hence the term “betweenness”. To make these key points stand out in the visual network, CiteSpace highlights them with a purple ring (Chen, 2006). The thickness of the purple ring indicates how strong the betweenness centrality is: the thicker the ring, the stronger the betweenness centrality.

Based on the duality of time variables between research frontiers and knowledge bases, CiteSpace explores the dynamic mechanism of disciplinary development through time mapping from the research frontier to the knowledge base (Chen, 2006). CiteSpace II displays the development trend of a discipline or knowledge domain over a certain period of time in an intuitive visual form, and analyzes the evolution of several research frontier fields. This study investigates the research trends and causes of AD development based on CiteSpace's document visualization analysis.
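
To make the notion of normalized betweenness centrality concrete, the following is a minimal sketch of Brandes' algorithm for an unweighted, undirected network. It is an illustration only and is not taken from CiteSpace; the function name, the toy network, and the choice of dividing by the number of node pairs (n - 1)(n - 2)/2 to reach the unit interval are assumptions.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Normalized betweenness centrality (Brandes' algorithm).

    `adjacency` maps each node to the set of its neighbours in an
    unweighted, undirected graph. Scores are scaled to [0, 1].
    """
    nodes = list(adjacency)
    score = {v: 0.0 for v in nodes}
    for source in nodes:
        # Breadth-first search that counts shortest paths from `source`.
        visit_order = []
        predecessors = {v: [] for v in nodes}
        n_paths = {v: 0 for v in nodes}
        distance = {v: -1 for v in nodes}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        dependency = {v: 0.0 for v in nodes}
        while visit_order:
            w = visit_order.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    n = len(nodes)
    # Every unordered pair was counted from both endpoints, so dividing by
    # (n - 1)(n - 2) is the same as halving and dividing by the pair count.
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 0.0
    return {v: s * scale for v, s in score.items()}


# Hypothetical toy network: node "B" bridges two small groups.
toy_network = {
    "A1": {"A2", "B"},
    "A2": {"A1", "B"},
    "B": {"A1", "A2", "C1", "C2"},
    "C1": {"C2", "B"},
    "C2": {"C1", "B"},
}
scores = betweenness_centrality(toy_network)
```

In this toy network, only the bridging node lies on shortest paths between the two groups, so it is the only node with a non-zero score, which is the kind of node that would receive a purple ring.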

Data and Methods

Data collection

The data for bibliometric analysis came from the Clarivate Analytics Web of Science Core Collection, which included SCI-EXPANDED, SSCI, A&HCI, CPCI-S, CPCI-SSH, BKCI-S, BKCI-SSH, ESCI, CCR-EXPANDED, and IC. The first article on AD in the Web of Science database was published in German by Heilbronner (Heilbronner, 1900). Since the 1990s, the number of AD-related studies has clearly increased; since 1990, the SCI database has added 110,339 AD studies. To explore the latest research trends in AD, we used the Web of Science to retrieve AD-related studies published from 2015 to 2019. The search terms were “Alzheimer's disease” OR “Alzheimer disease”. The search returned 30,061 studies, including 11,748 original research articles and reviews, which contained 693,938 references. Studies were downloaded on January 29, 2019, and the search records were exported to CiteSpace for further analysis. Each downloaded record included the authors, title, abstract, descriptors, and identifiers.

Inclusion criteria

Inclusion criteria were: (1) peer-reviewed published original articles on AD, including basic and clinical research; (2) reviews on AD; (3) articles published from 2015 to 2019; and (4) articles retrieved from the Web of Science.

Exclusion criteria

Exclusion criteria were: (1) articles collected by hand and telephone; (2) articles not officially published; (3) conference abstracts and proceedings, corrigendum documents; (4) repeated publications; and (5) unrelated articles.

Quality assessment

English articles that met the inclusion criteria and were “articles” or “reviews” were included in the analysis.
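
The screening described in the inclusion, exclusion, and quality criteria can be sketched as a simple record filter. This is a minimal illustration and not the authors' actual procedure; the field names (`id`, `doc_type`, `language`, `year`) and the sample records are hypothetical and do not correspond to Web of Science export tags.

```python
def select_records(records, first_year=2015, last_year=2019):
    """Keep English articles and reviews from the study period, once each."""
    seen_ids = set()
    selected = []
    for record in records:
        if record["doc_type"] not in {"Article", "Review"}:
            continue  # e.g. conference abstracts, proceedings, corrigenda
        if record["language"] != "English":
            continue
        if not first_year <= record["year"] <= last_year:
            continue
        if record["id"] in seen_ids:
            continue  # repeated publication
        seen_ids.add(record["id"])
        selected.append(record)
    return selected


# Hypothetical records for illustration only.
sample = [
    {"id": "r1", "doc_type": "Article", "language": "English", "year": 2016},
    {"id": "r1", "doc_type": "Article", "language": "English", "year": 2016},
    {"id": "r2", "doc_type": "Meeting Abstract", "language": "English", "year": 2017},
    {"id": "r3", "doc_type": "Review", "language": "German", "year": 2018},
    {"id": "r4", "doc_type": "Review", "language": "English", "year": 2019},
    {"id": "r5", "doc_type": "Article", "language": "English", "year": 2012},
]
kept = select_records(sample)
```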

Results and Discussion

Publication years and journals

In 1906, German psychiatrist Alois Alzheimer described the first case of AD. However, this description did not attract the attention of the medical community at the time. In 1910, Alzheimer's boss and research partner Emil Kraepelin officially named this disease “AD” in the Handbook of Psychiatry. From that point on, more research into AD was published, and by 1990, a total of 300 AD manuscripts were indexed. In the 1990s, the number of research articles on AD began to increase rapidly. In 1991, Goate et al. reported a missense mutation in the APP gene in a case of familial AD, and Strittmatter et al. discovered the AD-related gene ApoEε4 in 1993. In 1995, Games et al. established the first transgenic mice to express high levels of human mutant APP, which progressively develop the hallmarks of AD. In 1999, Vassar et al. located and cloned β-secretase, and Petersen et al. proposed the concept of “mild cognitive impairment”. The same year, Schenk et al. reported that immunization with amyloid-beta attenuates Alzheimer-disease-like pathology in the PDAPP mouse model. These important findings have promoted research into AD. Since 1990, the SCI database has included 110,339 AD manuscripts, and increasing attention has been paid to AD (Figure 1). The ten journals with the largest number of published AD studies are listed in Table 1; these can provide important references for new researchers.

Articles about AD are distributed over hundreds of journals. Journal of Alzheimer's Disease ranked first in the number of published articles (2218). It is a professional journal designed to promote the understanding of the etiology, pathogenesis, epidemiology, genetics, behavior, treatment and psychology of AD. The journal has been indexed by Web of Science since 2001, and has 6751 publications, with 114,506 citations. Its h-index is 107 and the average number of citations per article is 16.96. The second-ranked journal was Neurobiology of Aging (686 articles). This journal focuses on the mechanisms of nervous system changes with age or diseases associated with age. The journal has been indexed by Web of Science since 1980, and has 6176 publications, which have been cited 220,858 times. Its h-index is 168, and the average number of citations per article is 35.76.
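
The journal indicators quoted above follow from simple definitions: the average number of citations per article is the total citation count divided by the number of publications, and the h-index is the largest h such that h publications each have at least h citations. The sketch below illustrates both. The function names and the per-paper citation list are hypothetical; only the publication and citation totals come from the text.

```python
def mean_citations(total_citations, n_publications):
    """Average citations per article, rounded to two decimals."""
    return round(total_citations / n_publications, 2)


def h_index(citation_counts):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites < rank:
            break
        h = rank
    return h


# Totals reported in the text for the two top-ranked journals.
jad_mean = mean_citations(114506, 6751)
nba_mean = mean_citations(220858, 6176)

# Hypothetical per-paper citation counts, for illustration only.
toy_h = h_index([10, 8, 5, 4, 3])
```

With the reported totals, the two averages reproduce the values of 16.96 and 35.76 given in the text. The h-index cannot be recomputed from totals alone, because it depends on the citation count of each individual paper.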

Table 1 Top 10 most productive journals

Co-authorship

Studies published from 2015 to 2019 were chosen with a time slice of 1 year for the analysis, and the selection criteria were the top 50% per slice. The co-authorship network is displayed in Figure 2. The size of the circle represents the number of studies published by the author. The shorter the distance between two circles, the more cooperation between the two authors. Circles of the same color represent authors in the same cluster. Blue nodes represent earlier published studies, while yellow nodes represent more recently published studies. Figure 2 shows that many authors tend to collaborate with a relatively stable group of collaborators, generating several major author clusters, and each cluster usually contains two or more core authors. Figure 2 demonstrates that the most representative author in the field of AD is the Alzheimer's Disease Neuroimaging Initiative, with a total of 394 published studies, followed by David M. Holtzman and Kaj Blennow. This analysis can provide highly personalized scientific research information for other researchers.

Alzheimer's Disease Neuroimaging Initiative researchers collect, validate and utilize data, including MRI and PET images, genetics, cognitive tests, and CSF and blood biomarkers as predictors of AD. Institutional members come from 63 locations in the USA and Canada. The Alzheimer's Disease Neuroimaging Initiative is a group author in articles, and in 2008 their first article was included in the Web of Science. Since then, a total of 319 articles, with 7406 citations, can be found in the Web of Science. Their h-index is 46 and their average number of citations per article is 23.22.

Co-institute

When the institutes of two authors appear in the same article, this is counted as one cooperation. CiteSpace software mainly judges cooperation based on the co-occurrence frequency matrix. Studies published from 2015 to 2019 were chosen with a time slice of 1 year for the analysis, and the top 50 most-cited or -occurring items were chosen from each slice. Figure 3 exhibits co-institutes in the field of AD. Nodes represent institutes, and the size of each node corresponds to the co-occurrence frequency of the institutes. The size of the circle represents the number of papers published by the institute. The shorter the distance between two circles, the greater the cooperation between the two institutes. Purple rings indicate that these institutes have greater centrality (no less than 0.1). University College London in the UK has published the largest number of studies (533), followed by the University of California, San Francisco in the USA (518) and the Karolinska Institutet in Sweden (508).
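
The counting rule above, combined with one-year time slices and a top-N selection per slice, can be sketched as follows. This is a simplified illustration under stated assumptions, not CiteSpace's implementation: the function name, the record layout of (year, list of institutes), and the toy data are hypothetical, and ranking within a slice is done here by occurrence frequency only.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_links(records, top_n=50):
    """Count institute pairs that appear together in the same article.

    `records` is a sequence of (year, [institute, ...]) tuples. Within each
    one-year slice, only the `top_n` most frequent institutes are kept.
    """
    records = list(records)
    frequency = {}  # year -> Counter of institute occurrences in that slice
    for year, institutes in records:
        frequency.setdefault(year, Counter()).update(sorted(set(institutes)))
    kept = {
        year: {name for name, _ in counts.most_common(top_n)}
        for year, counts in frequency.items()
    }
    links = Counter()
    for year, institutes in records:
        selected = sorted(set(institutes) & kept[year])
        for pair in combinations(selected, 2):
            links[pair] += 1  # one shared article counts as one cooperation
    return links


# Hypothetical toy records for illustration only.
toy_records = [
    (2015, ["UCL", "UCSF"]),
    (2015, ["UCL", "Karolinska"]),
    (2016, ["UCL", "UCSF", "Karolinska"]),
    (2016, ["UCL", "UCSF"]),
]
toy_links = cooccurrence_links(toy_records)
```

The resulting pair counts are the off-diagonal entries of a symmetric co-occurrence frequency matrix; the same counting applies to countries or keywords when those fields replace institutes.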

Co-country

Studies published from 2015 to 2019 were chosen with a time slice of 1 year for the analysis, and the top 50 most-cited or -occurring items were chosen from each slice. Figure 4 displays co-country results in AD research. The size of the circle represents the number of papers published by the country. The shorter the distance between two circles, the greater the cooperation between the two countries.

Co-occurring keywords analysis

Co-occurring keywords reflect research hotspots in the field of AD. Studies published from 2015 to 2019 were chosen with a time slice of 1 year for CiteSpace analysis. The top 50 most-cited or -occurring items were chosen from each slice. In Figure 5, nodes represent keywords, and the size of each node corresponds to the co-occurring frequency of the keywords. The color of the lines linking co-occurring keywords indicates chronological order: blue represents the oldest, and orange the newest. The highest frequency was for “microglia” (8), followed by “insulin”, “genetics”, and “neuropathology”. Nodes marked with purple circles have high betweenness centrality, indicating that these keywords are important. In other words, these nodes represent emerging trends in the field of AD, with the strongest bursts. Figure 5 was sorted by time zone to obtain Figure 6, which shows the historical process of AD research.

Document co-citation analysis

A total of 11,748 studies were analyzed using CiteSpace software. Studies published from 2015 to 2019 were chosen with a time slice of 1 year for the analysis, and the most-cited or -occurring items were chosen from each slice. A document co-citation network map is displayed in Figure 7, and contains 132 unique nodes, 615 lines, and 9 main clusters. The modularity Q was 0.5072 and the mean silhouette value was 0.6617. The nodes represent cited references and the lines represent co-citation relationships among the collected studies. The more cited the study, the larger the node. The color and thickness of the circle in the node indicate the citation frequency at different time periods. Line colors correspond directly to the time slice, meaning that cold colors represent earlier years, while warm colors represent more recent years. For example, purple lines represent studies co-cited in 2006. Recent co-citation is visualized using yellow or orange lines. The modularity Q and the silhouette value are two indicators for evaluating clusters. Q > 0.3 indicates that the community structure of the network is significant, and a silhouette value > 0.5 indicates that the clustering results are reasonable. The citation year ring represents the citation history of a study; the color of the citation ring represents the corresponding citation time. The thickness of an annual ring is proportional to the number of citations in a time zone.
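
As an illustration of how a modularity value such as the Q reported above is obtained, the sketch below computes Newman's modularity for a given partition of an undirected, unweighted network, using Q = Σ_c [L_c / m - (d_c / 2m)^2], where m is the number of links, L_c the number of links inside cluster c, and d_c the sum of node degrees in cluster c. This is a generic textbook formulation and is only assumed to match what CiteSpace reports; the function name and the toy network are hypothetical.

```python
def modularity(edges, cluster_of):
    """Newman's modularity Q for an undirected, unweighted network.

    `edges` is a list of (u, v) pairs with no duplicates or self-loops;
    `cluster_of` maps each node to its cluster label.
    """
    m = len(edges)
    internal_links = {}  # cluster -> number of links inside the cluster
    degree_sum = {}      # cluster -> sum of degrees of its nodes
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal_links[cu] = internal_links.get(cu, 0) + 1
    return sum(
        internal_links.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Hypothetical toy network: two triangles joined by a single link.
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
toy_clusters = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
toy_q = modularity(toy_edges, toy_clusters)
```

For this toy partition, Q is 5/14 (about 0.357): each triangle holds 3 of the 7 links and half of the total degree, so each contributes 3/7 - 0.25.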

Figure 1 Time sequence of relevant papers on Alzheimer's disease published from 1991 to 2019 in Web of Science.

Figure 2 Co-authorship of Alzheimer's disease research. The most representative author is the Alzheimer's Disease Neuroimaging Initiative.

Figure 3 Co-institutes in the field of Alzheimer's disease. University College London in the United Kingdom published the largest number of studies.

Figure 4 Co-countries in the field of Alzheimer's disease.

Figure 5 Analysis of co-occurring keywords in Alzheimer's disease research. Microglia is the focus of research.

The top-ranked item by citation counts is McKhann et al. (2011) in Cluster 6 with a citation count of 2192, followed by Albert et al. (2011) in Cluster 2 with a citation count of 1285, Sperling et al. (2011) in Cluster 6 with a citation count of 1155, and Querfurth and LaFerla (2010) in Cluster 4 with a citation count of 910.

McKhann et al. (2011) reported that the National Institute on Aging and the Alzheimer's Association jointly revised the 1984 AD standards. It is hoped that the revised standards are flexible enough for general healthcare providers, clinical trial designers, or participants without neuropsychological testing, advanced imaging, and cerebrospinal fluid measurements. McKhann et al. proposed criteria for all-cause dementia and AD dementia, and retained the general framework of probable AD dementia from the 1984 criteria. These authors made some changes in the clinical criteria for diagnosis and retained the term “possible AD dementia”, but redefined it in a more focused way than before. Biomarker evidence has also been integrated into the diagnostic formulations for probable and possible AD dementia for use in research settings. Core clinical criteria of AD will continue to be the cornerstone of clinical practice, but biomarker evidence is expected to enhance pathophysiological specificity when diagnosing AD. Since the revised AD standards were published, Albert et al. (2011) and Sperling et al. (2011) have separately published two studies to interpret the revised AD standards. Albert et al. (2011) verified two sets of criteria: (1) core clinical criteria for healthcare providers without access to advanced imaging techniques or cerebrospinal fluid analysis, and (2) research criteria for clinical research settings, including clinical trials. The second set of criteria involves the application of biomarkers on the basis of imaging and cerebrospinal fluid measures. According to the presence and nature of the biomarker findings, the final set of criteria for AD-induced mild cognitive impairment has four levels of certainty. Considerable work is required to validate the criteria that use biomarkers, and to standardize biomarker analysis for further use in community settings. Sperling et al. (2011) considered that the pathophysiological process of AD begins many years before a diagnosis of AD dementia. This long “pre-clinical” phase of AD will provide critical opportunities for therapeutic intervention. Based on the mainstream scientific evidence to date, Sperling et al. proposed conceptual frameworks and operational criteria to test and improve these models through longitudinal clinical studies. However, these recommendations are only for research purposes and do not currently have any clinical significance. It is hoped that these recommendations can provide a common indicator to advance preclinical AD research and ultimately improve early intervention in AD, when certain disease-modulating therapies may be most effective. Querfurth and LaFerla (2010) summarized the research into AD from many angles, including abnormal beta-amyloid and tau, synaptic failure, neurotrophin and neurotransmitter depletion, and mitochondrial dysfunction (including oxidative stress, insulin signaling pathway disorders, vascular effects, inflammation, calcium and axonal transport defects, aberrant cell-cycle reentry, and cholesterol metabolism disorders). Querfurth and LaFerla hypothesized that AD is not a single linear chain of events, and that some changes are not pathological and may in fact be protective. Therefore, as is currently done for other polygenic diseases, there is a need to develop multi-target approaches for the prevention or symptomatic treatment of AD.

Figure 7 Document co-citation analysis in Alzheimer's disease research. McKhann et al. (2011) has the largest number of citations.

A document co-citation analysis of AD research produced 10 co-citation clusters, which were labeled with index terms from the articles citing them. To characterize the nature of each cluster, CiteSpace can extract noun phrases from the titles of the articles citing the cluster, based on three indicators: term frequency by inverse document frequency, log-likelihood ratio, and mutual information. Log-likelihood ratio usually provides the best results in terms of uniqueness and coverage of the topics associated with a cluster. Table 2 summarizes the details of 11 clusters. The silhouette value of each cluster is greater than 0.9, which indicates reliable and meaningful results. According to the document co-citation cluster labels, it can be observed that scholars and experts use different technical means to study the generation, diagnosis, and treatment of AD. Combined with highly cited papers, biomarker diagnosis of AD is at the frontier of current research. The proteomics, cytology, pathology, and genetics of AD have been studied broadly and in depth.
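
To show how a log-likelihood ratio can rank candidate labels for a cluster, the sketch below uses Dunning's G² statistic on a 2 × 2 table that compares how often a term appears in titles citing the cluster with how often it appears elsewhere. Whether CiteSpace uses exactly this formulation is an assumption, and the function name and the counts in the example are hypothetical.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """Dunning's G^2 for a 2 x 2 contingency table.

    k11: titles citing the cluster that contain the term
    k12: other titles that contain the term
    k21: titles citing the cluster without the term
    k22: other titles without the term
    """
    total = k11 + k12 + k21 + k22
    row_sums = (k11 + k12, k21 + k22)
    col_sums = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            count = observed[i][j]
            if count > 0:  # a zero cell contributes nothing to the sum
                expected = row_sums[i] * col_sums[j] / total
                g2 += count * math.log(count / expected)
    return 2.0 * g2


# Hypothetical counts: a term concentrated in one cluster scores higher
# than a term spread evenly across the whole collection.
concentrated = log_likelihood_ratio(30, 5, 70, 895)
evenly_spread = log_likelihood_ratio(10, 10, 10, 10)
```

A term whose frequency is the same inside and outside the cluster scores zero, whereas a term concentrated in the cluster scores high. The statistic measures the strength of the association and not its direction, so a practical labeling step would also check that the term is over-represented in the cluster.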

Emerging trends

Articles with citation bursts indicate a significant increase in research interest in the field of AD. Table 3 lists the 13 references with the strongest citation bursts from 2015 to 2019.

Figure 6 Co-occurring keywords in Alzheimer's disease research after the Figure 5 data are sorted into chronological order.

The first five references highlight the emerging trend of AD research in 2015, while the middle three references highlight the emerging trend for 2016-2017. The last five references were those that received great attention from 2017 to 2019, and they are the focus of current AD research. Reitz et al. (2011) outlined the criteria used in AD diagnosis, highlighting that AD is associated with normal aging, but is different from normal aging. Reitz et al. also summarized the latest information on AD prevalence, incidence, and risk factors, and reviewed biomarkers that could be used for risk assessment and diagnosis. The Alzheimer's Association report (Thies, 2013) provides information to increase the understanding of AD's impact on public health, including incidence, morbidity, mortality, and health expenditures and care costs, as well as its impact on nursing staff and on society as a whole. It also explores the role and unique challenges of long-distance caregivers, and interventions concerning these challenges. Harold et al. (2009) and Lambert et al. (2009) showed that the gene encoding apolipoprotein E on chromosome 19 is the only identified susceptibility locus for late-onset AD, and Lambert et al.'s large-scale genome-wide studies showed that two loci, CLU rs11136000 and CR1 rs6656401, gave replicated evidence of association. Shankar et al. (2008) hypothesized that soluble beta-amyloid oligomers extracted from the AD brain effectively impair synaptic structure and function, and that dimers are the smallest synaptotoxic species.

Table 2 The largest 11 clusters of Alzheimer's disease document co-citation, identified by subject headings

Table 3 The top 13 references with the strongest citation bursts

The magnitude of the citation burst value was used to measure the innovation of the research results, and serves as a “footprint” of the research frontier. The larger the burst value, the greater the innovation of the research outcomes, representing the frontier of this research field. The references with high burst values are shown in Table 4.

The highest-ranked study was by Selkoe and Hardy (2016) in Cluster 4, with a burst value of 87.3. The study ranked number 2 was by Shankar et al. (2008) in Cluster 4, with a burst value of 74.98. The third-ranked study was by Sevigny et al. (2016) in Cluster 5, with a burst value of 53.22. Sevigny et al. (2016) verified that aducanumab, a human monoclonal antibody that selectively targets aggregated beta amyloid, can reduce beta-amyloid plaques in AD. Hebert et al. (2013) predicted the prevalence of AD (2010-2050) using the 2010 U.S. census. They proposed that the total number of people with AD dementia in the U.S. is expected to be 13.8 million by 2050, of whom 7 million will be aged 85 years or older. Thus, unless precautions are taken, the number of people with AD dementia in the United States will increase dramatically over the next 40 years. Gorelick et al. (2011) demonstrated the importance of blood vessels in cognitive impairment and dementia, and suggested that long-term interventional studies of vascular risk markers may be needed as early as middle age to prevent or delay the onset of vascular cognitive impairment and AD. The enhancement and reduction of vascular risk factors in the high-risk population is another important research approach.

Table 4 Important Alzheimer's disease references with high burst values

Based on the World Alzheimer's Disease Report, which was prepared by Alzheimer's Disease International in 2009, Cummings et al. (2014) participated in the recently published World Health Organization Focus Report on Dementia: Public Health. From these reports, we can understand the growing impact of AD and other dementias on our society, and the need for action. The development of a national AD plan is a key tool for this initiative.

Conclusions

CiteSpace first calculates a visualization network of AD references. Based on CiteSpace results, we discussed key clustering, the established research model, and emerging trends in references. By exploring clustering software, we identified that the main knowledge domains in AD research are biomarkers, tau protein, neuropathology, microglia, and excitotoxicity. It could be concluded from the detected citation bursts that AD diagnostic criteria are an emerging trend in AD research. The present study demonstrated a quantitative scientometric method, and explored the progress of AD research by using references published in this field. The results will be helpful for professional workers to understand visually the recognition modes and trends. Compared with reviews, CiteSpace's analysis may be controversial, and its depth is insufficient. For example, backtracking exists in CiteSpace. As shown in Figure 2, the software cannot clearly distinguish the first author from the corresponding author. Nevertheless, we believe that, with the efforts of the CiteSpace research team, this software will be updated to overcome these shortcomings and provide more accurate and in-depth knowledge domain analyses in the future. This will provide different perspectives and characteristics for professionals to recognize a domain problem.

Author contributions: Conception and design of the work: YS; acquisition, analysis, interpretation of the data, drafting of manuscript: SL, YPS, XLG. All authors approved the final version of the paper.

Conflicts of interest: The authors declare that the article content was composed in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Financial support: None.

Copyright license agreement: The Copyright License Agreement has been signed by all authors before publication.

Data sharing statement: Datasets analyzed during the current study are available from the corresponding author on reasonable request.

Plagiarism check: Checked twice by iThenticate.

Peer review: Externally peer reviewed.

Open access statement: This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-Non-Commercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms.