Screening and validation of key genes in alcoholic liver disease based on bioinformatics
2023-06-07
王一涵 漆光紫 李文学 岑育芳 韦俊宏 唐玉航 庞雅琴
【Abstract】 Objective To explore potential key genes of alcoholic liver disease (ALD) using bioinformatics methods and to validate them experimentally, so as to provide a basis for identifying potential biomarkers of ALD. Methods Two gene expression profile chips (GSE28619 and GSE100901) were downloaded from the Gene Expression Omnibus (GEO), the public gene chip data platform of the National Center for Biotechnology Information (NCBI) in the United States. GEO2R was used to screen differentially expressed genes (DEGs) between the ALD group and the normal control group. Gene ontology (GO) and Kyoto encyclopedia of genes and genomes (KEGG) pathway enrichment analyses were performed on the DEGs. The STRING database was then used to construct a protein-protein interaction (PPI) network, and key genes were screened with Cytoscape. An ALD mouse model was established, and the screened key genes were validated by RT-qPCR. Results A total of 173 DEGs were identified. Enrichment analysis showed that the DEGs were mainly involved in five KEGG pathways: complement and coagulation cascades, cholesterol metabolism, retinol metabolism, drug metabolism-cytochrome P450, and bile secretion. Combining the results of the PPI network and CytoHubba, 9 key genes were screened out, namely SERPINC1, AHSG, FGG, FGA, ITIH3, FGB, APOB, ALB and APOH. RT-qPCR showed that, compared with WT mice, mRNA expression of ALB, APOB and FGB in the liver of ALD mice was upregulated (P<0.05), whereas mRNA expression of ITIH3, FGG and SERPINC1 was downregulated (P<0.05). Conclusion ALB, APOB, FGB, ITIH3, FGG and SERPINC1 are expected to be potential biomarkers for ALD.
【Key words】 bioinformatics; alcoholic liver disease (ALD); gene; biomarkers
CLC number: R575.5; Q811.4 Document code: A DOI: 10.3969/j.issn.1003-1383.2023.05.002
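Alcoholic liver disease (ALD), commonly known as alcoholic liver, is one of the most prevalent liver diseases worldwide [1]. According to a World Health Organization survey, alcohol caused 3 million deaths worldwide in 2016, accounting for 5.3% of all deaths and 5.1% of the global burden of disease [2]. Mortality from ALD has risen in recent years. In the United States, alcohol has overtaken hepatitis C virus as the leading cause of liver-related death, and the age-standardized mortality of ALD has increased continuously since 2007 at an annual rate of 3.1% [3]. By inhibiting mitochondrial β-oxidation and increasing fatty acid synthesis, alcohol leads to the accumulation of triglycerides, phospholipids and cholesterol esters, which induces fat deposition in the liver and results in alcoholic steatohepatitis with pathological features such as hepatocyte injury and ballooning [4]. Heavy drinking causes chronic hepatic inflammation, deposition of extracellular matrix and fibrosis; advanced fibrosis can disrupt the liver architecture, cause fibrosis of the liver parenchyma and progress to cirrhosis [5]. Chronic liver injury, oxidative stress, inflammation, fibrosis and the carcinogenic effects of alcohol metabolites may cause DNA mutations and drive the disease toward hepatocellular carcinoma [6-7]. At present the diagnosis of ALD still relies on biopsy, but because biopsy is invasive, patient acceptance is generally low. Biomarkers related to the development of ALD are therefore urgently needed for screening and diagnosis.
High-throughput technologies are widely used in the search for potential biomarkers [9]. Clinical bioinformatics, an emerging research approach, is among the most promising methods for the diagnosis, treatment and prognosis of difficult and complicated diseases [10]. These methods have been widely applied to the detection of cancers such as liver cancer and gastric cancer [11-13]. In some non-neoplastic diseases, bioinformatics analysis has also revealed many valuable novel biomarkers [14-16]. This study therefore used bioinformatics to make a preliminary exploration of the biomarkers and molecular mechanisms of ALD and to predict potential key genes associated with ALD, and then validated these key genes by RT-qPCR in the liver tissue of an ALD mouse model, to provide a reference for the discovery and application of new biomarkers of ALD.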
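1 Materials and methods
1.1 Data sources
Two mRNA gene chip datasets (GSE28619 and GSE100901) were selected from the GEO database (https://www.ncbi.nlm.nih.gov/geo/); they were chosen because the two datasets shared a relatively large number of differentially expressed genes. The chip platform of GSE28619 is GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array; the chip platform of GSE100901 is GPL13667 [HG-U219] Affymetrix Human Genome U219 Array.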
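1.2 Screening of differentially expressed genes (DEGs)
GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r) was used to analyze DEGs between the ALD and control samples, with P<0.05 and |log2(FC)|>1.0 as the screening criteria. Volcano plots were drawn with the "ggplot2" R package, heatmaps with the "ComplexHeatmap" R package, and the Venn diagram with the visualization software Funrich (http://funrich.org/).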
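The screening rule above can be sketched in a few lines of Python. This is only an illustration of the thresholds (P<0.05 and |log2(FC)|>1.0) followed by a plain set intersection of the two DEG lists; the gene names and values are hypothetical toy data, not results from GSE28619 or GSE100901, and the function name `select_degs` is introduced here for illustration. Whether the original overlap also required the same direction of change in both datasets is not stated in the text, so the sketch does not assume it.

```python
def select_degs(rows, p_cutoff=0.05, lfc_cutoff=1.0):
    """Return {gene: 'up' or 'down'} for rows passing both thresholds.

    rows: iterable of (gene, log2_fold_change, p_value) tuples.
    """
    degs = {}
    for gene, log_fc, p_value in rows:
        if p_value < p_cutoff and abs(log_fc) > lfc_cutoff:
            degs[gene] = "up" if log_fc > 0 else "down"
    return degs


# Hypothetical toy tables (gene, log2FC, P); NOT data from the study.
dataset_a = [
    ("GENE1", 1.8, 0.001),
    ("GENE2", -1.3, 0.020),
    ("GENE3", 0.4, 0.001),   # fails the fold-change threshold
    ("GENE4", 2.1, 0.200),   # fails the P threshold
]
dataset_b = [
    ("GENE1", 1.2, 0.010),
    ("GENE2", -2.0, 0.003),
    ("GENE5", -1.6, 0.004),
]

degs_a = select_degs(dataset_a)
degs_b = select_degs(dataset_b)

# Genes called differentially expressed in both datasets (the Venn overlap).
shared = sorted(set(degs_a) & set(degs_b))
print(shared)
```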
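1.3 Gene ontology (GO) and Kyoto encyclopedia of genes and genomes (KEGG) analysis of DEGs
GO functional enrichment was analyzed with the online database DAVID (https://david.ncifcrf.gov/), and the genes were annotated and classified by biological process (BP), cellular component (CC) and molecular function (MF). KEGG contains datasets on biological functions, diseases, chemical substances, drugs and biological pathways. The statistical method was the EASE Score or the Fisher exact test, and the screening criterion for both GO and KEGG terms was P<0.05 [17]. Bubble plots of the results were drawn with the "ggplot2" R package.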
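To make the enrichment statistic concrete, the sketch below computes a one-sided Fisher exact P value as the upper tail of the hypergeometric distribution, using only the Python standard library. It is a minimal illustration of the idea rather than DAVID's implementation: DAVID's EASE Score is a more conservative modification of this test and is not reproduced here, and the function name `enrichment_p` and all counts in the example are hypothetical.

```python
from math import comb


def enrichment_p(pop_size, pop_hits, list_size, list_hits):
    """One-sided Fisher exact P value: P(X >= list_hits).

    pop_size : number of background genes
    pop_hits : background genes annotated to the term
    list_size: number of genes in the DEG list
    list_hits: DEGs annotated to the term
    """
    upper = min(pop_hits, list_size)
    total = comb(pop_size, list_size)
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, list_size - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / total


# Hypothetical counts for illustration only (not values from the study):
# 20000 background genes, 80 in the pathway, 173 DEGs, 9 DEGs in the pathway.
p_value = enrichment_p(20000, 80, 173, 9)
print(p_value < 0.05)
```

A term would be retained under the P<0.05 criterion described above when the returned value falls below 0.05.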
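1.4 PPI network construction and key gene screening for DEGs
STRING (https://string-db.org/) can be used to display and evaluate PPI networks [18]. All DEGs identified in this study were imported into STRING, and the STRING analysis tools were used to explore potential connections among them [19]. The screening conditions were a confidence score ≥0.15 and a maximum number of interactors of 0. The STRING output was then imported into Cytoscape [20], and the most densely connected clusters in the PPI network were extracted [21] with the molecular complex detection (MCODE) plugin using its default parameters. In addition, the cytoHubba plugin was used to select the top-ranked key genes in the PPI network.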
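The hub-gene step can be illustrated with a simple degree ranking over an undirected edge list. This sketch assumes degree (number of distinct interaction partners) as the ranking score; cytoHubba offers several scoring methods and the specific settings are not restated here. The node names and edges are hypothetical placeholders, not interactions from the STRING output, and `rank_by_degree` is a name introduced for illustration.

```python
from collections import Counter


def rank_by_degree(edges, top_n):
    """Rank nodes of an undirected network by degree.

    Duplicate edges (in either orientation) and self-loops are ignored.
    Ties are broken alphabetically so the output is deterministic.
    """
    unique_edges = {tuple(sorted(edge)) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for node_a, node_b in unique_edges:
        degree[node_a] += 1
        degree[node_b] += 1
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical toy network; NOT interactions from the study.
toy_edges = [
    ("A", "B"),
    ("A", "C"),
    ("A", "D"),
    ("B", "C"),
    ("C", "B"),  # duplicate of ("B", "C") in reverse orientation
    ("D", "D"),  # self-loop, ignored
]

hubs = rank_by_degree(toy_edges, top_n=2)
print(hubs)
```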
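1.5 Establishment of the ALD mouse model and RT-qPCR validation of the screened key genes
Liver tissue was obtained from the Department of Toxicology, Guangzhou Center for Disease Control and Prevention, from 3 ALD mice (fed regular chow with 10% alcohol as the sole drink for 30 consecutive days) and 3 normal WT mice (fed regular chow and water). The laboratory animal use license number is SYXK (粤) 2018-0002. Liver samples were frozen in liquid nitrogen immediately after collection, and RNA was extracted according to an established protocol. Complementary DNA (cDNA) was synthesized with the PrimeScript RT reagent Kit with gDNA Eraser (Takara) according to the manufacturer's instructions. Amplification conditions were as follows: 95 ℃ for 10 min to start, followed by 40 cycles of 95 ℃ for 15 s and 60 ℃ for 30 s. Mouse β-actin was used as the internal reference gene, and relative expression was calculated with the 2^-ΔΔCt method. The primer sequences used in this study are listed in Table 1.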
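As a worked illustration of the 2^-ΔΔCt calculation, the sketch below takes ΔCt = Ct(target) - Ct(reference) in each group and ΔΔCt = ΔCt(ALD) - ΔCt(WT), then reports 2^-ΔΔCt as the fold change of ALD relative to WT. Averaging Ct values per group before taking differences is a simplifying assumption made here; the Ct values are hypothetical and `relative_expression` is a name introduced for illustration.

```python
from statistics import mean


def relative_expression(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of case vs control by the 2^-ΔΔCt method.

    Each argument is a list of Ct values (one per animal) for the target
    gene or the reference gene (e.g. β-actin) in the case or control group.
    """
    delta_case = mean(ct_target_case) - mean(ct_ref_case)
    delta_ctrl = mean(ct_target_ctrl) - mean(ct_ref_ctrl)
    delta_delta_ct = delta_case - delta_ctrl
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values for illustration only (not measurements from the study).
fold_change = relative_expression(
    ct_target_case=[24.0, 24.2, 23.8],  # target gene, ALD mice
    ct_ref_case=[18.0, 18.1, 17.9],     # reference gene, ALD mice
    ct_target_ctrl=[25.0, 25.1, 24.9],  # target gene, WT mice
    ct_ref_ctrl=[18.0, 18.0, 18.0],     # reference gene, WT mice
)
print(round(fold_change, 3))
```

In this toy case ΔΔCt is about -1, so the target transcript is roughly twice as abundant in the case group; a value below 1 would indicate downregulation.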
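1.6 Statistical methods
The mRNA expression data of the 9 key genes in the liver tissue of WT and ALD mice were entered into a database and analyzed with SPSS 25.0. mRNA expression levels are presented as mean ± standard deviation (x̄ ± s), and the two groups were compared with the independent-samples t test. GraphPad Prism 7 was used to plot the mRNA expression of the 9 key genes in the liver tissue of WT and ALD mice. The test level was α=0.05, two-sided.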
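For readers who want to see the group comparison spelled out, the following sketch computes the pooled-variance (Student) independent-samples t statistic and compares it with the two-sided critical value for α=0.05 at 4 degrees of freedom (2.776, the case of 3 animals per group). It assumes equal variances, which is only one of the variants a statistics package reports; the expression values are hypothetical and `student_t` is a name introduced for illustration.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Pooled-variance independent-samples t statistic and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(group_a) - mean(group_b)) / standard_error
    return t_stat, df


# Hypothetical relative expression values (not data from the study).
ald_group = [2.0, 2.2, 1.8]
wt_group = [1.0, 1.1, 0.9]

t_stat, df = student_t(ald_group, wt_group)

# Two-sided critical value of the t distribution for alpha = 0.05, df = 4.
T_CRITICAL_DF4 = 2.776
significant = df == 4 and abs(t_stat) > T_CRITICAL_DF4
print(round(t_stat, 3), df, significant)
```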
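2 Results
2.1 DEG screening results
Two gene expression profile datasets (GSE28619 and GSE100901), each comprising healthy control samples and ALD samples, were retrieved from the GEO database. GSE28619 contained 3 healthy control samples and 3 ALD samples; compared with the control group, the ALD group had 911 upregulated and 684 downregulated genes (|log2(FC)|>1.0, P<0.05). GSE100901 contained 8 healthy control samples and 8 ALD samples; compared with the control group, the ALD group had 2259 upregulated and 2867 downregulated genes (|log2(FC)|>1.0, P<0.05). Volcano plots and heatmaps were drawn for visualization (Figure 1, Figure 2). The expression profiles of all DEGs were compared and a Venn diagram was drawn (Figure 3); 173 genes were DEGs shared by the two datasets.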
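2.2 GO and KEGG analysis of DEGs
GO and KEGG pathway enrichment analyses of the shared DEGs were performed with the DAVID database; the results are shown in Figure 4. GO functions comprise BP, MF and CC. GO enrichment showed that the DEGs were mainly located in the collagen-containing extracellular matrix, blood microparticles, plasma lipoprotein particles, lipoprotein particles and high-density lipoprotein particles. Molecular functions mainly involved glycosaminoglycan binding, heparin binding, peptidase inhibitor activity, endopeptidase inhibitor activity and sulfur compound binding. Biological processes mainly included extracellular structure organization, acute inflammatory response, protein activation, carboxylic acid biosynthetic process and organic acid biosynthetic process. KEGG analysis showed that the DEGs were mainly associated with the complement and coagulation cascades, cholesterol metabolism, retinol metabolism, drug metabolism-cytochrome P450 and bile secretion pathways.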
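2.3 PPI network analysis of DEGs and key gene screening
Based on the STRING database, PPI analysis of the DEGs was performed and visualized in Cytoscape (Figure 5). According to the node degree scores generated by Cytoscape, the top 9 genes, SERPINC1, AHSG, FGG, FGA, ITIH3, FGB, APOB, ALB and APOH, were taken as potential core genes (Figure 6).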
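2.4 RT-qPCR validation of mRNA expression of the 9 key genes in the liver tissue of WT and ALD mice
RNA was extracted from the livers of WT and ALD mice and analyzed by RT-qPCR. mRNA expression of ALB, APOB and FGB was higher in ALD mice than in WT mice, and the differences were statistically significant (P<0.05). mRNA expression of ITIH3, FGG and SERPINC1 was lower in ALD mice than in WT mice, and the differences were statistically significant (P<0.05) (Table 2, Figure 7). These results indicate that, compared with WT liver tissue, ALB, APOB and FGB mRNA was upregulated and ITIH3, FGG and SERPINC1 mRNA was downregulated in ALD liver tissue, and that both trends were consistent with the direction of differential expression in the GEO datasets.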
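3 Discussion
The pathogenesis of ALD is multifactorial and involves environmental, genetic and lifestyle factors [22]. In this study, a total of 173 DEGs were identified. Functional enrichment indicated that the DEGs were mainly involved in five KEGG pathways: complement and coagulation cascades, cholesterol metabolism, retinol metabolism, drug metabolism-cytochrome P450, and bile secretion. Combining the PPI and CytoHubba results, the top 9 key genes were screened out: SERPINC1, AHSG, FGG, FGA, ITIH3, FGB, APOB, ALB and APOH. RT-qPCR further showed that, compared with normal mouse liver tissue, ALB, APOB and FGB mRNA was upregulated in the ALD group, whereas ITIH3, FGG and SERPINC1 mRNA was downregulated.
ALB, APOB, FGB, ITIH3, FGG and SERPINC1 mostly act in liver function and the immune response. The main function of ALB is to regulate the colloid osmotic pressure of blood [23]; it is also the main plasma transport protein for zinc, calcium and magnesium [24], and ALB rises with liver injury, poor blood circulation and edema [25]. APOB is the major protein constituent of chylomicrons (ApoB-48), low-density lipoprotein (LDL; ApoB-100) and VLDL (ApoB-100) [26]. It acts as a recognition signal for the cellular binding and internalization of LDL particles through the apoB/E receptor [27], and may promote the progression of steatosis in ALD. Fibrinogen β (FGB) is cleaved by the protease thrombin to yield monomers which, together with fibrinogen α (FGA) and fibrinogen γ (FGG), polymerize to form an insoluble fibrin matrix [28]. Fibrin deposition is also associated with infection: it protects against IFNG-mediated hemorrhage [29] and can promote antibacterial immune responses through innate and T cell-mediated pathways [30], so FGB may be related to the hepatic inflammation caused by ALD. ITIH3 acts as a carrier of hyaluronan in serum and as a binding protein between hyaluronan and other matrix proteins, including cell-surface proteins in tissues. It regulates the localization, synthesis and degradation of hyaluronan, which is essential for cells undergoing biological processes [31]; it also plays a role in hemostasis and platelet activation and may be related to the changes in liver function caused by ALD [32]. FGG polymerizes with FGA and FGB to form an insoluble fibrin matrix [33] and has a major function in hemostasis [34]. SERPINC1 is the most important serine protease inhibitor in plasma; it regulates the coagulation cascade, its inhibitory activity is greatly enhanced in the presence of heparin [35], and it may be related to the changes in liver function caused by ALD. Our findings provide a basis for further exploration of potential biomarkers of ALD, the pathophysiological mechanisms underlying disease onset and progression, and new treatment options.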
References
[1] DANIELS S J,LEEMING D J,ESLAM M,et al.ADAPT:an algorithm incorporating PRO-C3 accurately identifies patients with NAFLD and advanced fibrosis[J].Hepatology,2019,69(3):1075-1086.
[2] EDGE S B,COMPTON C C.The American Joint Committee on Cancer: the 7th edition of the AJCC cancer staging manual and the future of TNM[J].Ann Surg Oncol,2010,17(6):1471-1474.
[3] ZHOU Z,RUAN B L,HU G.Anti-intoxication and protective effects of a recombinant serine protease inhibitor from Lentinula edodes against acute alcohol-induced liver injury in mice[J].Appl Microbiol Biotechnol,2020,104(11):4985-4993.
[4] DUGUM M,MCCULLOUGH A.Diagnosis and management of alcoholic liver disease[J].J Clin Transl Hepatol,2015,3(2):109-116.
[5] SINGAL A K,BATALLER R,AHN J,et al.ACG Clinical Guideline: alcoholic liver disease[J].Am J Gastroenterol,2018,113(2):175-194.
[6] LIU J.Ethanol and liver: recent insights into the mechanisms of ethanol-induced fatty liver[J].World J Gastroenterol,2014,20(40):14672-14685.
[7] WALESKY C,EDWARDS G,BORUDE P,et al.Hepatocyte nuclear factor 4 alpha deletion promotes diethylnitrosamine-induced hepatocellular carcinoma in rodents[J].Hepatology,2013,57(6):2480-2490.
[8] TORKADI P P,APTE I C,BHUTE A K.Biochemical evaluation of patients of alcoholic liver disease and non-alcoholic liver disease[J].Indian J Clin Biochem,2014,29(1):79-83.
[9] PHALLEN J,SAUSEN M,ADLEFF V,et al.Direct detection of early-stage cancers using circulating tumor DNA[J].Science Translational Medicine,2017,9(403):eaan2415.
[10] LI R,SIM I.How clinical trial data sharing platforms can advance the study of biomarkers[J].J Law Med Ethics,2019,47(3):369-373.
[11] TSAI S,GAMBLIN T C.Molecular characteristics of biliary tract and primary liver tumors[J].Surg Oncol Clin N Am,2019,28(4):685-693.
[12] YAN P,HE Y,XIE K,et al.In silico analyses for potential key genes associated with gastric cancer[J].PeerJ,2018,6:e6092.
[13] YE B,SMERIN D,GAO Q,et al.High-throughput sequencing of the immune repertoire in oncology:applications for clinical diagnosis,monitoring,and immunotherapies[J].Cancer Lett,2018,416:42-56.
[14] YUAN L Q,de JESUS P V,LIAO X B,et al.MicroRNA and cardiovascular disease 2016[J].BioMed Res Int,2017,2017:3780513.
[15] CHEN L,SU W,CHEN H,et al.Proteomics for biomarker identification and clinical application in kidney disease[J].Adv Clin Chem,2018,85:91-113.
[16] GENG R X,LI N,XU Y,et al.Identification of core biomarkers associated with outcome in glioma:evidence from bioinformatics analysis[J].Dis Markers,2018,2018:3215958.
[17] RAO L,SONG Z,YU X,et al.Progranulin as a novel biomarker in diagnosis of early-onset neonatal sepsis[J].Cytokine,2020,128:155000.
[18] STANSKI N L,STENSON E K,CVIJANOVICH N Z,et al.PERSEVERE biomarkers predict severe acute kidney injury and renal recovery in pediatric septic shock[J].Am J Respir Crit Care Med,2020,201(7):848-855.
[19] WANG S,XIAO C,LIU C,et al.Identification of biomarkers of sepsis-associated acute kidney injury in pediatric patients based on UPLC-QTOF/MS[J].Inflammation,2020,43(2):629-640.
[20] YEHYA N,WONG H R.Adaptation of a biomarker-based sepsis mortality risk stratification tool for pediatric acute respiratory distress syndrome[J].Crit Care Med,2018,46(1):e9-e16.
[21] WEISS S L,PETERS M J,ALHAZZANI W,et al.Surviving sepsis campaign international guidelines for the management of septic shock and sepsis-associated organ dysfunction in children[J].Pediatr Crit Care Med,2020,21(2):e52-e106.
[22] JANG J,PARK S,JIN H H,et al.25-hydroxycholesterol contributes to cerebral inflammation of X-linked adrenoleukodystrophy through activation of the NLRP3 inflammasome[J].Nat Commun,2016,7:13129.
[23] MALMSTROM E,KILSGARD O,HAURI S,et al.Large-scale inference of protein tissue origin in gram-positive sepsis plasma using quantitative targeted proteomics[J].Nat Commun,2016,7:10261.
[24] READ S A,O'CONNOR K S,SUPPIAH V,et al.Zinc is a potent and specific inhibitor of IFN-lambda3 signalling[J].Nat Commun,2017,8:15245.
[25] WANG C Y,XIAO X,BAYER A,et al.Ablation of hepatocyte Smad1,Smad5,and Smad8 causes severe tissue iron loading and liver fibrosis in mice[J].Hepatology,2019,70(6):1986-2002.
[26] ENKAVI G,JAVANAINEN M,KULIG W,et al.Multiscale simulations of biological membranes:the challenge to understand biological phenomena in a living substance[J].Chem Rev,2019,119(9):5607-5774.
[27] WEI W Q,LI X,FENG Q,et al.LPA variants are associated with residual cardiovascular risk in patients receiving statins[J].Circulation,2018,138(17):1839-1849.
[28] LIU X,SPERANZA E,MUNOZ-FONTELA C,et al.Transcriptomic signatures differentiate survival from fatal outcomes in humans infected with Ebola virus[J].Genome Biol,2017,18(1):4.
[29] LUYENDYK J P,SCHOENECKER J G,FLICK M J.The multifaceted role of fibrinogen in tissue injury and inflammation[J].Blood,2019,133(6):511-520.
[30] ANDREJEVA G,RATHMELL J C.Similarities and distinctions of cancer and immune metabolism in inflammation and tumors[J].Cell Metab,2017,26(1):49-70.
[31] OBRY A,HARDOUIN J,LEQUERR T,et al.Identification of 7 proteins in sera of RA patients with potential to predict ETA/MTX treatment response[J].Theranostics,2015,5(11):1214-1224.
[32] TOLEDO A G,GOLDEN G,CAMPOS A R,et al.Proteomic atlas of organ vasculopathies triggered by Staphylococcus aureus sepsis[J].Nat Commun,2019,10(1):4656.
[33] HAPPONEN L,HAURI S,SVENSSON B G,et al.A quantitative Streptococcus pyogenes-human protein-protein interaction map reveals localization of opsonizing antibodies[J].Nat Commun,2019,10(1):2727.
[34] SABINO M,CARMELO V,MAZZONI G,et al.Gene co-expression networks in liver and muscle transcriptome reveal sex-specific gene expression in lambs fed with a mix of essential oils[J].BMC Genom,2018,19(1):236.
[35] JIANG P,GU S,PAN D,et al.Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response[J].Nat Med,2018,24(10):1550-1558.
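(Received: 2023-03-15; revised: 2023-04-06)
(Editor: 黄研研)
Funding: National Natural Science Foundation of China (81960595, 81360438, 81660549); Guangxi Natural Science Foundation (2019JJD140011); 2022 Master's Graduate Innovation Program of Youjiang Medical University for Nationalities (YXCXJH2022010)
About the author: 王一涵, male, resident physician, master's degree candidate; research interest: molecular toxicology. E-mail: 452158749@qq.com
Corresponding author: 庞雅琴. E-mail: pangyaqin@126.com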