Type 1 diabetes loci display a variety of native American and African ancestries in diseased individuals from Northwest Colombia
2019-11-15
Natalia Gomez-Lopera, Juan M Alfaro, Suzanne M Leal, Nicolas Pineda-Trujillo
Abstract
Key words: Type 1 diabetes; Genetic admixture; Native American; Idiopathic; Colombia
This article was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
INTRODUCTION
Type 1 diabetes (T1D) is a heterogeneous disease with pathogenic processes and phenotypic characteristics that show marked variation. It is accepted that genetic effects are an important factor in this heterogeneity. The HLA region, located on chromosome 6p21, confers the major genetic susceptibility to T1D, contributing up to 50% [1]. In addition, over 50 non-HLA genes identified so far increase susceptibility to T1D [2,3]. Recently, we identified that RNASEH1 gene variants associate with T1D in Northwest Colombia [4]. This gene, which is located on chromosomal region 2p25, has not thus far been associated with the disease elsewhere. A wide geographical variation in the incidence of T1D, both among and within countries, has been reported [5]. The incidence of T1D is higher in Europeans [6-8] than in Latin American countries [7,8]. Genetic admixture is a factor that influences allelic frequencies in a population; this may, in part, help explain the differences observed in T1D epidemiology.
Three studies in Latin America have tested the admixture effect on T1D. Two of these were carried out in Brazil [9,10] and the third in Cuba [11]. These three studies found that T1D patients are mostly of European descent and not necessarily different from controls. Nevertheless, one Brazilian study and the one from Cuba reported that patients carried a greater European component than their controls; this observation was established as a risk factor [9,11].
In Colombia, the admixture process differed in each region of the country. Populations in southern Colombia show higher values of Native American ancestry (NAT, average 60%), whilst African (AFR) ancestry is more common in the region of Chocó (average 68%) and the Caribbean coast (average 30%) [12-14]. On the other hand, northwest Colombia, inhabited by the “paisa” population, exhibits the highest percentage of European ancestry, which ranges across studies from 47% to 79% [15-19]. In Colombia, the admixture effect has been examined for some complex diseases such as type 2 diabetes [20], asthma [21], cancer [22,23], dengue [24] and Alzheimer’s disease [17], as well as for cardio-metabolic parameters [25].
Although much work on the admixture effect on several phenotypes has been done in Latin America and Colombia, none has tested this effect on T1D in Colombian patients. Our purpose was to analyze the genetic admixture composition of a set of Colombian T1D patients by testing previously reported admixture informative markers (AIMs) in the vicinity of previously reported T1D candidate genes/loci. In addition, two chromosomal regions of high relevance to T1D in our population were tested more thoroughly. These loci were 6p21 (HLA), which is globally accepted as the T1D master risk locus, and 2p25 (RNASEH1), which has so far been reported solely in Colombia. We inferred individual patient proportions of European, AFR and NAT ancestry components. Although the European component was higher than the two other parental contributions in a global analysis, some loci were clearly non-European in cases vs the reference population, or between T1D categories. This study sheds light on the genetics of T1D in a Colombian population and reinforces the importance of including different approaches when investigating the genetic architecture of T1D. This is suggested by finding no admixture differences in strongly associated T1D loci, such as HLA (IDDM1) or IDDM2. In contrast, a strong genetic admixture effect was observed for other loci not described as major determinants of T1D, for instance chromosomal regions 5p13.2 and 10p11.22.
MATERIALS AND METHODS
Study population
The study group consisted of 200 Colombian individuals with T1D. Their age at onset was <15 years. Diagnostic criteria followed those of the American Diabetes Association [26]. Patients were considered “Paisas” according to a self-reported questionnaire that traced their geographical origin back to their great-grandparents. Other questions covered gender, age at onset, and other family members with autoimmune diseases.
Patients were identified in the main pediatric endocrinology institutes in Antioquia: Program of Pediatric Endocrinology (Universidad de Antioquia and Hospital San Vicente Fundación), IPS Universitaria, Universidad Pontificia Bolivariana, Instituto Antioqueño de Diabetes and Clinica Integral de Diabetes. This study was approved by the ethics committee of the Faculty of Medicine at Universidad de Antioquia. Informed consent was obtained from patients and their parents before drawing blood samples.
Auto-antibodies testing
Two diabetes-related autoantibodies (AABs) were tested in serum samples from the 200 patients. These AABs were glutamic acid decarboxylase (GAD-65 kDa) and protein tyrosine phosphatase-like antigen-2 (IA-2), as reported previously [4]. They were measured using commercial ELISA-based kits (AESKULISA and LifeSpan BioSciences, Inc) according to the manufacturer's instructions. A patient who presented with at least one of these AABs was classified as autoimmune (T1AD); otherwise, the patient was classified as idiopathic (T1BD).
Genotyping and admixture estimation
Genomic DNA was isolated from peripheral blood samples using either the phenol-chloroform or the salting-out protocol. A set of 75 AIMs was tested in 200 T1D patient samples using the Kompetitive Allele-Specific PCR genotyping technology (KASP™), which was undertaken by the company LGC Genomics Ltd. Details of this method can be obtained from https://www.lgcgroup.com/genotyping/.
The AIMs used have a high discriminatory power (δ > 45%) among ancestral populations (Supplementary Table S1), which increases the statistical power for estimating individual ancestry. We selected these markers from the Latino population panels reported by Mao et al [27], Galanter et al [28] and Ruiz-Linares et al [29]. The AIMs were distributed throughout the genome, tagging previously reported T1D candidate loci. However, we chose a higher density of markers for chromosome 2 (23 AIMs), where the RNASEH1 gene is located, and for chromosome 6 (18 AIMs), where the HLA region is located.
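The δ criterion above can be illustrated with a minimal sketch. It assumes that δ is the absolute allele-frequency difference between two ancestral populations and that a marker qualifies when at least one pair of populations exceeds 0.45; the marker names and frequencies are hypothetical and are not taken from Supplementary Table S1.

```python
def delta(freq_a, freq_b):
    """Absolute allele-frequency difference between two populations."""
    return abs(freq_a - freq_b)


def max_pairwise_delta(freqs):
    """Largest delta over all population pairs; freqs maps population -> frequency."""
    pops = sorted(freqs)
    return max(
        delta(freqs[a], freqs[b])
        for i, a in enumerate(pops)
        for b in pops[i + 1:]
    )


# Hypothetical reference-allele frequencies, for illustration only.
markers = {
    "aim_1": {"EUR": 0.10, "NAT": 0.85, "AFR": 0.15},
    "aim_2": {"EUR": 0.40, "NAT": 0.55, "AFR": 0.50},
}

# Assumed selection rule: keep markers whose largest pairwise delta exceeds 0.45.
informative = [name for name, f in markers.items() if max_pairwise_delta(f) > 0.45]
print(informative)
```

In this toy input only `aim_1` passes, because it separates NAT from the other two populations; `aim_2` has similar frequencies everywhere and would add little ancestry information.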
The 1000 Genomes database was used to extract genetic information from 94 Colombians living in Medellin (CLM) for the 74 AIMs successfully typed (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). These individuals are from the same geographical region as the patients. We calculated allele and genotype frequencies, and tested Hardy-Weinberg equilibrium (HWE), using PLINK v1.07 [30]. In addition, we considered only markers that were not in linkage disequilibrium with each other. We used markers with a genotyping rate higher than 95% and without significant deviations from HWE after Bonferroni correction.
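As a rough illustration of the quality-control step above, the sketch below applies a textbook one-degree-of-freedom chi-square HWE test to genotype counts and computes a Bonferroni threshold for 74 markers. This is an assumption-laden stand-in: it is not necessarily the test that PLINK computes, and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for HWE (1 degree of freedom)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Bonferroni threshold for 74 tests at a nominal alpha of 0.05.
alpha = 0.05
n_tests = 74
threshold = alpha / n_tests

# Hypothetical genotype counts for one marker in 200 individuals.
chi2, p_value = hwe_chi_square(50, 100, 50)
deviates = p_value < threshold
print(chi2, p_value, threshold, deviates)
```

With 74 tests the per-marker threshold is about 6.76 × 10⁻⁴, so a marker is flagged only when its HWE P value falls below that value.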
We estimated individual European, NAT and AFR ancestry proportions using the ADMIXTURE software [31]. The proportions of each component were estimated using a supervised-learning strategy, providing the genotypes of the 74 AIMs from the AFR, European and NAT reference populations (k = 3). We used 74 of the 75 AIMs since one failed the PCR optimization.
To obtain the parental population allele frequencies, we selected genotypes from 165 Europeans (Utah residents with Northern and Western European ancestry, named CEU) and 165 AFR individuals (Yoruba people in Ibadan, Nigeria, named YRI) genotyped in the HapMap project and deposited in the 1000 Genomes database. Since we did not have access to NAT DNA samples or to publicly available NAT genotype data on all 74 AIMs, we generated the genotypes of the 74 AIMs for 150 simulated individuals according to the NAT allele frequencies previously reported in the panels.
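The text does not describe how the simulated NAT genotypes were drawn. One plausible approach, sketched below, assumes Hardy-Weinberg proportions and independent markers, so that each genotype is the sum of two independent allele draws at the reported frequency. The function name, seed and frequencies are hypothetical.

```python
import random


def simulate_genotypes(allele_freqs, n_individuals, seed=0):
    """Simulate genotypes coded as 0, 1 or 2 copies of the reference allele.

    Assumes HWE and independence between markers (an assumption of this
    sketch, not a procedure stated in the text).
    """
    rng = random.Random(seed)
    individuals = []
    for _ in range(n_individuals):
        # Two independent allele draws per marker, one per chromosome.
        genotype = [sum(rng.random() < f for _ in range(2)) for f in allele_freqs]
        individuals.append(genotype)
    return individuals


# Hypothetical NAT allele frequencies for three markers.
nat_freqs = [0.0, 1.0, 0.5]
simulated = simulate_genotypes(nat_freqs, 150)
print(len(simulated), simulated[0])
```

A fixed seed keeps the simulated reference panel reproducible, which matters because the panel feeds directly into the supervised ancestry estimates.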
Statistical analysis
Comparisons between groups for continuous variables that did not follow a normal distribution were performed using the Mann-Whitney U-test. Thus, ancestry medians were compared between T1D subtypes (T1AD and T1BD) according to AABs, and between individuals with early or late age at onset (i.e., ≤5 years or >5 years, respectively). These groups were also compared to the CLM population. We performed these analyses for AIMs distributed across the whole set of candidate loci and independently for loci on different chromosomes. We ran all statistical analyses and produced all graphs in R v3.3.3 [32]. We also tested allelic association of these AIMs between T1D and CLM using PLINK v1.07 [30].
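For readers unfamiliar with the Mann-Whitney U-test, the following is a minimal sketch using average ranks for ties and a normal approximation without continuity correction. The study ran its analyses in R, so this Python version is only an illustration, and the sample values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U-test with tie-corrected normal approximation."""
    combined = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(combined)
    tie_term = 0.0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Hypothetical ancestry proportions for two small groups.
u_stat, p_val = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(u_stat, p_val)
```

For very small samples an exact test is preferable to the normal approximation used here; the approximation is reasonable at the group sizes reported in this study.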
RESULTS
T1D versus CLM,our reference population
One of the 75 AIMs failed the PCR optimization. Therefore, we tested a total of 74 AIMs in 200 T1D patients from Antioquia, Colombia. AIM characteristics are shown in Supplementary Table S1. Overall, the genotyping rate was >96% for every AIM, and there was no deviation from HWE after Bonferroni correction for multiple testing (P = 6.75 × 10⁻⁴). Also, as expected, none of the AIMs were in linkage disequilibrium with each other (data not shown).
The overall ancestral genetic makeup of the 200 T1D children showed a predominant proportion of European ancestry (EUR, median = 61.58), followed by NAT ancestry (median = 27.34) and, at a lower proportion, AFR ancestry (median = 10.28; Table 1 and Figure 1). Figure 1 presents the ancestry distribution for the 200 T1D children studied here; the European component is the most prominent. European ancestry ranged from 22% to 93%, NAT ancestry from 0 to 65%, and AFR ancestry from 0 to 40%.
Looking at the overall set of AIMs, and also at their distribution in specific loci, the median EUR ancestry of diseased individuals ranged from 61.58 down to 11.56. The lowest value was found for chromosome 5 AIMs (Table 1). NAT ancestry ranged from 52.18 down to 24.30 in the diseased subjects. The highest value was found for gene IL7R AIMs (chromosome 5), and the lowest for gene EFR3B AIMs (chromosome 2). The AFR component ranged from 20.58 down to 0.01; the lowest AFR ancestry was found for gene IFIH1 AIMs (chromosome 2, Table 1). The wide ancestry variation across chromosomal regions is noticeable.
Overall, the CLM reference population displayed a very similar ancestry distribution to that of T1D cases. Nonetheless, specific T1D loci presented marked differences between the two groups. One such difference was observed for the gene EFR3B, which presented with higher NAT ancestry in the CLM population (P = 0.02), suggesting a protective role against developing T1D (Table 1). Also, at gene IFIH1, T1D patients presented with lower European ancestry (P = 0.05), at the expense of a higher NAT component than in CLM (Table 1). Other differences between T1D and CLM were observed for the IL7R and NRP1 genes (chromosomes 5 and 10, respectively), as follows.
Table 1 Genetic ancestry of type 1 diabetes patients compared to Colombians living in Medellin control population
Chromosome 5 AIMs (gene IL7R) showed less European (P = 7.0 × 10⁻³) and less AFR ancestry (P = 1.56 × 10⁻⁵) in diseased subjects than in the CLM population; consequently, T1D patients had more NAT ancestry than CLM subjects at this chromosomal region (P = 1.0 × 10⁻⁴, Table 1). Regarding chromosome 10 AIMs (gene NRP1), T1D patients presented a high European component compared to CLM (63.32 vs 0.03, Table 1). Conversely, patients presented an almost zero AFR component for this chromosomal region compared to CLM (0.03 vs 94.23, Table 1). T1D patients also displayed a higher NAT ancestry at this locus than CLM (36.67 vs 0.003, Table 1).
An exploratory association analysis showed that, after adjusting for admixture, seven markers were associated with T1D (Supplementary Table S2 and Table 2). The most significant findings were located on chromosomes 5 and 10 (P = 5.56 × 10⁻⁶ and 8.70 × 10⁻¹⁹, respectively). It is interesting that only one MHC marker (rs2395656) was associated with the disease, and that this association was weaker (P = 0.04) than those of the markers on chromosomes 5 and 10 (Table 2).
Ancestral components considering T1D subtypes(according to autoimmunity and age at onset)
We stratified the T1D sample according to the presence (T1AD, autoimmune) or absence (T1BD, idiopathic) of diabetes-related AABs; we also stratified the patient group according to age at onset, i.e., early (≤5 years) or late (>5 years). We found that 78% (n = 156) of the patients had at least one T1D-specific autoantibody (GAD-65 or IA-2), while the remaining 22% (n = 44) were negative for these two antibodies. The average age at onset was 8.25 years for T1AD and 7.22 years for T1BD. We did not find significant differences between men and women within these two groups (data not shown).
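The two stratification rules described above can be written down compactly. The sketch below is illustrative only: the `Patient` record and its field names are hypothetical, while the classification rules (at least one positive autoantibody for T1AD; onset at ≤5 years for early onset) follow the text.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout, not the study's actual data format.
    gad65_positive: bool
    ia2_positive: bool
    age_at_onset: float  # years


def autoimmune_category(patient):
    """T1AD if at least one of the two autoantibodies is present, else T1BD."""
    if patient.gad65_positive or patient.ia2_positive:
        return "T1AD"
    return "T1BD"


def onset_category(patient):
    """Early onset is defined as age at onset of 5 years or younger."""
    return "early" if patient.age_at_onset <= 5 else "late"


example = Patient(gad65_positive=True, ia2_positive=False, age_at_onset=3.0)
print(autoimmune_category(example), onset_category(example))
```

Keeping the two classifications as independent functions mirrors the analysis, where each stratification is compared separately against ancestry proportions.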
Figure 1 Ancestry proportions of 200 type 1 diabetes patients from Colombia. EUR: European; NAT: Native American; AFR: African.
Over thirty percent (n = 61, 30.5%) of T1D individuals developed the disease before the age of 5 years, with an average age at onset of 2.66 years. The remaining sample (69.5%, n = 139) presented with an age at onset after 5 years, with a mean for this category of 10.28 years. As in the stratification by AABs, we did not find significant differences between men and women within the age at onset categories (data not shown). Regarding the autoimmune category, comparison of the ancestral genetic composition identified no differences for the 74 AIMs taken together (Table 3). However, looking at individual loci, MHC AIMs presented with lower NAT ancestry in the autoimmune subgroup (P = 0.019, Table 3).
In addition, when comparing diseased individuals in the autoimmune categories to the CLM population, gene EFR3B AIMs presented differences in their ancestral components (Supplementary Table S3). Thus, autoimmune patients presented with less NAT ancestry (P = 0.032), whilst the idiopathic T1D category presented with higher AFR ancestry (P = 0.016). Regarding the age at onset categories, AFR ancestry was significantly higher in the late onset subgroup at gene IL7R AIMs (P = 0.023, Table 4). Comparing these two categories to the CLM population showed no significant differences for either the overall set of AIMs or specific loci (Supplementary Table S3).
DISCUSSION
T1D incidence differences among countries, mainly related to European versus non-European populations, led us to assess whether our T1D patients had a predominantly European ancestral component or another one. Our analyses were based on 74 AIMs located at previously reported T1D loci/genes. AIM deltas (δs) between the NAT, European and AFR populations indicated that they were appropriate discriminators. We found that T1D patients from northwest Colombia are predominantly of European ancestry, followed by NAT and AFR components. Proportion estimates of the three parental populations for this sample were consistent with those reported in previous studies for Colombians that used different sets of markers [13,16,19,20,29].
We also compared T1D children to CLM. Analysis of the overall set of AIMs found no statistically significant differences in the ancestral genetic component between the two groups. Comparable results were obtained by Gomes et al [10] in Sao Paulo, Brazil; they noted that the European component predominated in both T1D patients and controls, followed by AFR and NAT ancestry, but no significant differences between cases and controls were observed. In contrast, a study conducted in ten Brazilian cities showed that T1D patients presented a higher percentage of European component than the healthy population [9]. Similarly, a study by Diaz-Horta et al [11] found a higher proportion of European component in cases than in controls. Moreover, they found a risk association with European ancestry.
Table 2 Significant findings in an exploratory association analysis
Further analysis disaggregating the candidate loci tested led us to find a different ancestry composition for MHC AIMs. Lower NAT ancestry was observed in T1AD compared to T1BD patients (Table 3). Ancestry variation at the HLA region has been reported for Latin American populations. However, such variation has shown an excess of the AFR component in these populations, including CLM [16,33,34]. It has been suggested that the excess of the AFR component in the HLA region in Latin America is due to positive selection driven by the presence of infectious agents during the process of the conquest. The European conquerors brought African and European diseases such as smallpox, measles, and influenza to America, which caused massive epidemics and were responsible for the extinction of many native populations [34]. Given this historical background, these AFR fragments could have gained a selective advantage, since AFR populations have the most diverse HLA repertoire [35,36]. However, the ancestry variation observed here shows that the European component is higher in autoimmune (T1AD) than in T1BD patients, in combination with lower NAT ancestry in T1AD than in T1BD (Table 3).
Another gene with remarkable findings is IFIH1. This observation is of particular interest to our population, since we had previously found that SNP rs10930046, which is located at IFIH1, associates with T1D in our population [37]. This SNP has been reported as a rare variant in European populations (MAF = 0.02) related to psoriasis [38]. Interestingly, we found in our previous study that this variant has a MAF of 0.3 [37]. Such an allele frequency difference could have been speculatively explained by random genetic drift, involving over-representation of European chromosomes carrying such variants at the time of the conquest of Colombia. However, evidence from the present study suggests that this allele frequency difference between populations might be a NAT contribution.
It is worth mentioning that IFIH1 AIMs presented widely different values for the AFR component when comparing autoimmune to idiopathic patients (14.99 vs 0.01, Table 3), without reaching statistical significance. This was the case because the interquartile ranges of these two autoimmune categories overlapped. Neither gene CTLA4 nor RNASEH1 AIMs revealed significant contributions to T1D, either in the overall set of AIMs or in any of the loci/genes analyzed. Regarding CTLA4, this observation is consistent with our previous finding of no association of this gene variant with T1D [37]. However, a different situation holds for the RNASEH1 gene.
RNASEH1 gene variants have thus far been associated with T1D only in the northwest Colombia population and not elsewhere in the world [4]. This association has not even been reported in GWA studies using large sample sizes, albeit mostly of European origin [3]. Analyzing a larger sample of T1D patients from this region of Colombia will allow us to conclude whether there really is an ancestry effect related to RNASEH1 gene variants in T1D.
Unexpectedly, we found that ancestry for chromosomes 5 and 10 was sharply different between T1D patients and the CLM population (Tables 1 and 3). The former involves chromosomal region 5p13.2 (IL7R) [39]. This region was assessed with only one AIM, which clearly discriminates between NAT and non-NAT ancestry (Supplementary Table S1). As shown in Table 1, the T1D ancestry observed for this locus is clearly greater for NAT, at the expense of the two other ancestries. It is also apparent that AFR ancestry at this locus contributes to late onset of the disease (Table 4). This result, in turn, should be taken with caution since this AIM does not clearly discriminate between EUR and AFR (Supplementary Table S1). Therefore, we cannot rule out the possibility that this effect is of European origin.
The second striking finding involves chromosomal region 10p11.22 (gene NRP1) [40]. Although the opposite ancestry contributions between T1D and CLM are evident (Table 1), it is worth keeping in mind that the only AIM (rs3123687) used for this locus is highly informative for AFR vs non-AFR ancestries (i.e., either EUR or NAT). Given this, we are aware that the conclusion regarding a greater NAT contribution in our study could eventually shift towards greater EUR ancestry. Therefore, we can only state that the difference observed is non-AFR, and we are not able to define whether it is European or NAT.
The actual SNPs reported as associated with disease in these two genes (IL7R and NRP1) have not yet been tested in the sample presented here. However, a test of association using the AIMs analyzed here, after adjusting for the admixture effect, revealed that AIM rs700164 associates with affected status (P = 5.56 × 10⁻⁶, Supplementary Table S2 and Table 2) and that rs3123687 similarly associates strongly with the disease (P = 8.07 × 10⁻¹⁹, Supplementary Table S2 and Table 2), for the IL7R and NRP1 genes, respectively. This finding should be verified using the transmission disequilibrium test (TDT), which is not susceptible to population structure issues such as admixture. This analysis is to be done for the actual SNPs, as the parents of the patients presented here are available. Such association analyses should include choosing gene variants from the genetic variability in this set of patients, and should also consider the LD blocks observed in this population.
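To make the proposed verification concrete, the standard TDT statistic compares how often heterozygous parents transmit versus do not transmit a given allele to affected offspring: χ² = (b − c)² / (b + c), with one degree of freedom. The sketch below is a minimal illustration; the transmission counts are hypothetical and are not data from this study.

```python
import math


def tdt(transmitted, untransmitted):
    """McNemar-type TDT statistic and P value (chi-square, 1 df).

    transmitted / untransmitted count how often heterozygous parents did or
    did not pass the allele of interest to an affected child.
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts from heterozygous parents of affected children.
tdt_chi2, tdt_p = tdt(30, 10)
print(tdt_chi2, tdt_p)
```

Because the comparison is made within families, both transmitted and untransmitted alleles come from the same ancestral background, which is why the test is robust to admixture.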
No ancestry differences were found overall when comparing T1AD to idiopathic (T1BD) patients (Table 3). T1AD, whose etiology and pathology are better characterized, has a higher incidence in Europe [6]; in contrast, T1BD is reported mainly in AFR and Asian countries [26]. Our results differ from those of Piñero-Piloña et al [41], who reported a high incidence of T1BD in Mexican patients, whose predominant ancestral component was NAT. Our cohort presents a majority of autoimmune cases (78%) and, as described here, their predominant ancestry is European.
Table 4 Genetic ancestry for type 1 diabetes patients stratified according to age at onset
However, looking at chromosomal regions in the analysis stratified by age at onset of T1D, we found that patients with a late onset of the disease have a greater AFR component, which was more marked on chromosome 5 (Table 4). This suggests that AFR ancestry could be a risk factor for developing the disease at a late age in our population (over 2/3 of the sample had an age at onset >5 years), which can modify the metabolic phenotype of patients and influence the risk of late complications of diabetes [42].
Our study has an important limitation regarding the number and location of the AIMs: chromosomes 5 and 10 were tested with just a few such markers. It will be worth testing more AIMs near these two loci to further examine the differences revealed. Also, the reference population we used (CLM from the 1000 Genomes database), although supposedly unaffected and older than our patients, was typed by a different method from the one we used to type our T1D cases. Nonetheless, both groups share comparable genetic ancestries.
Our study’s strength is its population choice. As described, the northwest Colombia population is the one with the greatest European component in the country [15-19]. Thus, our results are consistent with the overall European contribution, together with an apparently unexplored NAT input to T1D, in addition to certain contributions of AFR ancestry to late age at onset.
In conclusion, this study describes the ancestral genetic composition of 200 T1D patients from an admixed population from northwest Colombia. Consistently, we found a predominant proportion of European ancestry, followed by NAT ancestry. No statistical difference was observed in the distribution of the proportions of ancestral genetic components between T1D patients and the CLM reference population. A variation in chromosomal segments derived from the parental populations was observed when comparing individuals with T1AD versus T1BD, and those who had an early (≤5 years) or late (>5 years) age at onset of the disease. These results demonstrate that the study of genetic admixture provides new perspectives on the delineation of the genetic architecture underlying autoimmune diseases. Finally, performing a new study in this sample, including an unbiased distribution of AIMs throughout the whole genome, could help find loci undetected in previous studies, which would contribute to completing the T1D genetic architecture for our population. This will also help make approaches such as the polygenic risk score more accurate for these types of populations.
ARTICLE HIGHLIGHTS
Research background
Type 1 diabetes (T1D) is described as a disease found predominantly in white populations. Subtypes of the disease are also more frequent in different ethnicities. Thus, the autoimmune form of the disease is observed more frequently in Caucasian countries, whilst the idiopathic form is more frequently observed in African and Asian countries. The patients included in this study are from Northwest Colombia. This is an admixed population originating from three ancestral contributions. This population has been described as the most European in the country, followed by Native American ancestry, with the African contribution being its smallest component.
Research motivation
In this study, we looked at the genetic ancestry of a set of 200 diseased subjects from Northwest Colombia. We were interested in describing the particular ancestry of their global genetic makeup, as well as of some specific genomic regions. Only a few studies of this type have been reported in Latin American populations, and none have been carried out in Colombia.
Research objectives
We aimed to describe the ancestry composition of a cohort of Colombian patients with T1D. This description included both a global analysis and specific tests on loci/genes previously related to the disease.
Research methods
We studied 200 diseased subjects from Northwest Colombia. We tested 75 admixture informative markers (AIMs) distributed across a set of previously reported genes (or chromosomal regions) associated with T1D. The disease was classified as either autoimmune or idiopathic in the study subjects. This was done by testing two disease-related auto-antibodies (AABs). If at least one such AAB was present, the disease was classified as autoimmune. We also classified the age at onset of the disease as early (≤5 years) or late (>5 years). The reference population of Colombians living in Medellin (CLM) was compared to the set of patients presented here. We applied appropriate statistical tests given the non-normality of the data obtained.
Research results
Seventy-eight percent of the patients presented at least one AAB. Over two thirds (69.5%) of the subjects developed the disease after 5 years of age. There were no significant differences between genders among the affected individuals. Seventy-four AIMs were successfully tested (one failed the PCR optimization). Both the diseased and CLM groups were predominantly of European ancestry (61.58 vs 62.06), followed by Native American (24.30 vs 37.10) and African ancestries (10.28 vs 10.65). In addition, specific genes such as EFR3B, IFIH1, IL7R and NRP1 displayed differential Native American or African rather than European contributions. We also found that autoimmune patients displayed lower Native American ancestry than idiopathic cases.
Research conclusions
Our study shows that diseased individuals from Northwest Colombia are predominantly of European ancestry, followed by Native American and African ancestries. Also, non-European contributions were found for specific genes in our study.
Research perspectives
MHC is expected to play the strongest role in T1D susceptibility. However, this was not the observation in our study. Our results suggest that different locus effect sizes might be at play in our admixed population. This is inferred from the strength of significance observed for MHC ancestry compared to other loci. Therefore, it would be worth testing AIMs throughout the whole genome in this sample (expanded with extra individuals from the same region of Colombia). This way, it would be feasible to reveal differences in local ancestry for either known or unknown loci associated with T1D in our population. This would help complete the genetic architecture of the disease, particularly for our population. In turn, this would contribute to the knowledge of the disease biology, and would also make this sample population appropriate for applying approaches such as the polygenic risk score.
ACKNOWLEDGEMENTS
We are very grateful to the patients who participated in this study. We are also very grateful to Doctors Martin Toro, Maria Victoria Lopera, Jorge García and Alejandra Velez for contributing patients to this study.