
Pseudogenization of the Humanin gene is common in the mitochondrial DNA of many vertebrates

2017-08-24 Ian Logan

Zoological Research, 2017, Issue 4

Ian S. Logan*

22 Parkside Drive, Exmouth, Devon, UK

In the human the peptide Humanin is produced from the small Humanin gene, which is embedded as a gene-within-a-gene in the 16S ribosomal molecule of the mitochondrial DNA (mtDNA). The peptide itself appears to be significant in the prevention of cell death in many tissues and to improve cognition in animal models. By using simple data mining techniques, it is possible to show that 99.4% of the human Humanin sequences in the GenBank database are unaffected by mutations. However, in other vertebrates, pseudogenization of the Humanin gene is a common feature, occurring apparently randomly in some species and not in others. The persistence, or loss, of a functional Humanin gene may be an important factor in laboratory animals, especially if they are being used as animal models in studies of Alzheimer’s disease (AD). The exact reason why Humanin underwent pseudogenization in some vertebrate species during their evolution remains to be determined. This study was originally planned to review the available information about Humanin, and it was a surprise to be able to show that pseudogenization has occurred in a gene in the mtDNA and is not restricted solely to chromosomal genes.

mtDNA; Humanin; Pseudogenization; NUMT

INTRODUCTION

The peptide Humanin was first described in 2001 (Hashimoto et al., 2001), when it was observed during a study on Alzheimer’s disease (AD) that the death of cells was prevented by the presence of the peptide. A further paper from the same research group (Terashita et al., 2003) described how the sequence of the peptide appeared to be produced by a hitherto unrecognised gene-within-a-gene in the MT-RNR2 (16S ribosomal RNA) gene of the human mitochondrial DNA (mtDNA). Subsequently, Bodzioch et al. (2009) described that the mitochondrial gene was responsible for the production of Humanin, but also suggested that the peptide might be produced from chromosomal DNA, as sequences very similar to mitochondrial Humanin could be found in the fragmentary copies of the mtDNA that exist in nuclear chromosomal DNA. This report also mentioned that sequences similar to the Humanin gene, as found in the human, are ‘…

Recent reports have not really settled the point as to whether Humanin is produced by the mitochondrial gene, the nuclear mitochondrial DNA segment (NUMT) copies, or both, but have concentrated more on the actions of Humanin as a protective agent, especially in AD (Cohen et al., 2015; Hashimoto et al., 2013; Matsuoka, 2015; Tajima et al., 2002), as a retrograde signal peptide passing information about the mitochondrion to the rest of the cell (Lee et al., 2013) and as an agent improving cognition (Murakami et al., 2017; Wu et al., 2017).

Synthetic Humanin can now be purchased for research purposes from several companies, and is available with the standard amino acid sequence, or with the 14th amino acid altered from Serine to Glycine, the S14G form, a change which appears to increase the potency of the peptide (Li et al., 2013).

Mitochondria are small organelles found in all developing eukaryotic cells. Each mitochondrion contains a few small rings of double-stranded DNA. Human mtDNA is described as containing 16569 numbered nucleotide bases, a fairly typical number for a vertebrate, and was the first mitochondrial DNA molecule to be fully sequenced (Anderson et al., 1981). This complete mtDNA sequence was later updated to eliminate the errors (revised Cambridge Reference Sequence [rCRS]; Andrews et al., 1999; Bandelt et al., 2014). The GenBank database (Benson et al., 2005) now holds over 36000 human mtDNA sequences submitted by research institutes and some private individuals, as well as about 11000 mtDNA sequences from other vertebrates. The rCRS (GenBank Accession No. NC_012920) describes the mtDNA as having 13 genes for peptides that form part of the OXPHOS system, 22 genes for the production of transfer RNA sequences, a hypervariable control region, several small non-coding sequences situated between the other parts, and, most importantly for the present discussion, two genes for the production of RNA sequences (MT-RNR1 for the RNA 12S molecule and MT-RNR2 for the RNA 16S molecule), which are structural components used in the building of ribosomes. The Humanin gene is located at region 2633–2705 in the rCRS and encodes a 24 amino acid peptide. But as this region lies in the centre of the MT-RNR2 gene, the nucleotide bases of the Humanin gene also have the separate function of being a part of the ribosomal RNA 16S molecule.
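As a quick check of the coordinates quoted above, the following minimal Python sketch computes the length of the region and how it divides into codons. The variable names are illustrative only; the coordinates are those given for the rCRS in the text.

```python
# Coordinates of the Humanin gene in the rCRS, as quoted in the text
# (1-based, inclusive).
START, END = 2633, 2705

region_length = END - START + 1                   # bases in the region
full_codons, leftover = divmod(region_length, 3)  # complete codons, spare bases

print(region_length, full_codons, leftover)       # prints: 73 24 1
```

The 24 complete codons correspond to the 24 amino acid residues, and the single remaining base is consistent with a stop signal that is not written out as a full codon in the DNA.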

In this paper simple mining techniques (Yao et al., 2009; Zaki et al., 2007) have been used to look at the Humanin gene in the mitochondrial sequences of the human and other vertebrates available in the GenBank database. The study showed that pseudogenization of the Humanin gene does not occur in the human, but is a common feature in other vertebrates. Moreover, the study shows the surprise finding that pseudogenization has occurred in a gene in the mtDNA and is not restricted solely to chromosomal genes.

DATASET AND METHODS

The mitochondrial sequences held in the GenBank database formed the dataset for this study. This database holds about 36000 different human mtDNA sequences. The individual page on the database for each sequence can be found using a direct link of the form: https://www.ncbi.nlm.nih.gov/nuccore/NC_012920. This particular link connects to the page for the rCRS; the page for any other sequence can be found by replacing NC_012920 with another Accession No.

A list of the human mtDNA sequences can be found by searching the GenBank database with a query string such as: "homo sapiens" [organism] "complete genome" mitochondrion. Although this list should contain the details of different mtDNA sequences, note in particular that many sequences from the Human Genome Diversity Project are duplicated and in some instances triplicated.
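The direct link and the query string described above can be assembled programmatically. The following is a minimal sketch only: the function names are hypothetical, and the use of the NCBI E-utilities `esearch` endpoint (with its `db`, `term` and `retmax` parameters) is an assumption made here for illustration, not something stated in the paper. The code only builds the URLs and does not contact the database.

```python
from urllib.parse import quote, urlencode

NUCCORE_BASE = "https://www.ncbi.nlm.nih.gov/nuccore/"
# Assumption: the public E-utilities search endpoint is used for scripted queries.
ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def nuccore_url(accession: str) -> str:
    """Return the direct link to the GenBank page for one Accession No."""
    return NUCCORE_BASE + quote(accession.strip())


def esearch_url(term: str, retmax: int = 20) -> str:
    """Return an esearch URL for a nucleotide query (hypothetical helper)."""
    params = {"db": "nuccore", "term": term, "retmax": retmax}
    return ESEARCH_BASE + "?" + urlencode(params)


# The rCRS page, and the query string quoted in the text.
rcrs_link = nuccore_url("NC_012920")
human_query = esearch_url('"homo sapiens" [organism] "complete genome" mitochondrion')
print(rcrs_link)
print(human_query)
```

Because the result list may contain duplicated entries, any count derived from such a query would need to be checked before being treated as the number of distinct sequences.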

The GenBank database also contains the mitochondrial sequences for approximately 11000 other vertebrate samples. About 4000 of these sequences are described as Reference Sequences and have Accession Nos. in the range NC_000000–NC_999999 (Pruitt et al., 2007). Each Reference Sequence comes from a different species, so the mtDNA sequences available on the GenBank database can be considered as coming from about 4000 different species of vertebrate.

Unfortunately, GenBank does not give details of the parts of the MT-RNR2 gene in the description accompanying a mtDNA sequence, so it is necessary to identify the Humanin sequence by searching for it. However, as the Humanin gene is well conserved throughout vertebrates, the sequences can be identified fairly easily.

Initially, the sequences described in this study were found by visual examination of the FASTA file for each sequence, but subsequently a pattern matching computer program (in JavaScript) was developed (Supplementary Program 1, available online). This program requires as its input the MT-RNR2 gene in FASTA format. The 73 bases of the Humanin gene as given in the rCRS are then compared by stepping along the MT-RNR2 gene, and a best fit is found. A non-matching comparison typically finds about 30 bases in common (i.e., about 40%), but a Humanin sequence will match about 50 bases (i.e., about 70%), even for a distant species, and much better for other mammalian species.
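The stepping-along comparison described above can be sketched as an ungapped sliding-window match. This is a minimal Python illustration under stated assumptions, not the author's Supplementary Program 1 (which is in JavaScript and is not reproduced here); the function names are hypothetical, and the short sequences in the example are toy data rather than real mtDNA.

```python
def read_fasta_sequence(text: str) -> str:
    """Join the sequence lines of a single-record FASTA string, skipping the header."""
    lines = [line.strip() for line in text.splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">")).upper()


def best_fit(query: str, target: str) -> tuple[int, int]:
    """Slide query along target and return (offset, matching bases) of the best window.

    The comparison is ungapped: each window of target has the same length as query.
    """
    query, target = query.upper(), target.upper()
    if len(query) > len(target):
        raise ValueError("query is longer than target")
    best_offset, best_matches = 0, -1
    for offset in range(len(target) - len(query) + 1):
        window = target[offset:offset + len(query)]
        matches = sum(q == t for q, t in zip(query, window))
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    return best_offset, best_matches


# Toy example: the query matches 3 of 4 bases at offset 2.
offset, matches = best_fit("ACGT", read_fasta_sequence(">toy\nTTAC\nCTTT\n"))
print(offset, matches, f"{100 * matches / 4:.0f}%")  # prints: 2 3 75%
```

With a 73-base query, the thresholds quoted in the text would correspond to roughly 30 matching bases for an unrelated window and roughly 50 or more for a genuine Humanin sequence.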

The results of this study are presented in three parts: firstly, the variants found in the 36000+ human mtDNA sequences available for study in the GenBank database; secondly, the Humanin sequences found in other vertebrates; and thirdly, the NUMT sequences found in the human and some non-human species.

For clarity, the mtDNA variants are listed in the format “rCRS allele, position, derived allele”, instead of the nomenclature proposed by the HGVS (http://varnomen.hgvs.org/). For instance, the mtDNA variant T2638C may also be written as m.2638T>C according to HGVS nomenclature, and the protein variant P3S as p.P3S.
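The relationship between the two notations can be illustrated with a small conversion routine. This is a sketch only, assuming simple single-base substitutions written as in this paper; the function names are hypothetical, and insertions, deletions and heteroplasmy codes are not handled.

```python
import re

# Single-base substitution in the paper's format, e.g. "T2638C".
_MT_SUBSTITUTION = re.compile(r"^([ACGT])(\d+)([ACGT])$")
# Single-residue substitution with one-letter codes, e.g. "P3S".
_PROTEIN_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def mtdna_to_hgvs(variant: str) -> str:
    """Convert 'T2638C' (rCRS allele, position, derived allele) to 'm.2638T>C'."""
    match = _MT_SUBSTITUTION.match(variant.strip().upper())
    if match is None:
        raise ValueError(f"not a simple mtDNA substitution: {variant!r}")
    ref, position, alt = match.groups()
    return f"m.{position}{ref}>{alt}"


def protein_to_hgvs(variant: str) -> str:
    """Convert 'P3S' to 'p.P3S', keeping the one-letter codes used in the text."""
    cleaned = variant.strip().upper()
    if _PROTEIN_SUBSTITUTION.match(cleaned) is None:
        raise ValueError(f"not a simple protein substitution: {variant!r}")
    return f"p.{cleaned}"


print(mtdna_to_hgvs("T2638C"))   # prints: m.2638T>C
print(protein_to_hgvs("P3S"))    # prints: p.P3S
```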

RESULTS

Variants identified in the human mtDNA sequences

The nucleotide sequence for the Humanin gene in the rCRS (GenBank Accession No. NC_012920) was predicted to encode a peptide of 24 amino acid residues, together with a stop codon (MAPRGFSCLLLLTSEIDLPVKRRAX), and this sequence was found in 99.4% of all the human mtDNA sequences. Data mining of all the human mtDNA genomes in GenBank showed that 17 variants are known so far. Five of these are associated with particular mitochondrial haplogroups: A2f1 (T2638C, n=17, no change in the amino acid sequence), N1b (C2639T, n=96, amino acid change P3S), U6a7a1a (A2672G, n=16, amino acid change S14G, which produces the well-known potent form of Humanin), H13a1a2b (G2701A, n=8, no change in the amino acid sequence) and N1a1a (G2702A, n=71, amino acid change A24T). These five variants are common enough to be used as defining variants in the phylogenetic tree (van Oven & Kayser, 2009) (Supplementary Table 1, available online).

The other 12 variants all occur at low frequencies (1–6 sequences each), and in many of the sequences the presence of a mutation should be considered as ‘unverified’. An expanded table of these results is given in Supplementary Table 1 (available online).
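The effect of the three amino acid changes named above on the 24-residue peptide can be shown with a short Python sketch. The reference peptide is the one quoted in the text (without the trailing X for the stop codon); the helper function is hypothetical and handles only single-residue substitutions.

```python
# Humanin peptide predicted from the rCRS, as quoted in the text (stop 'X' omitted).
REFERENCE_PEPTIDE = "MAPRGFSCLLLLTSEIDLPVKRRA"


def apply_substitution(peptide: str, change: str) -> str:
    """Apply a change such as 'S14G' (reference residue, 1-based position, new residue)."""
    ref, position, alt = change[0], int(change[1:-1]), change[-1]
    if not 1 <= position <= len(peptide):
        raise ValueError(f"position {position} is outside the peptide")
    if peptide[position - 1] != ref:
        raise ValueError(f"expected {ref} at position {position}, found {peptide[position - 1]}")
    return peptide[:position - 1] + alt + peptide[position:]


for change in ("P3S", "S14G", "A24T"):
    print(change, apply_substitution(REFERENCE_PEPTIDE, change))
```

The check on the reference residue guards against off-by-one errors when positions are counted by hand.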

The Humanin sequences found in non-human vertebrates

Data mining of the non-human vertebrate mtDNA sequences showed that Humanin sequences are present in all vertebrates. However, many of the sequences show pseudogenization and these sequences are not able to produce functional peptides. A pseudogene is recognizable because of the absence of a start codon, the presence of a premature stop codon, or a deletion/insertion causing a frameshift in the sequence.
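The first two criteria above can be checked mechanically once a candidate sequence has been located. The following Python sketch is an illustration under stated assumptions, not the method used in the study: it uses NCBI translation table 2 (the vertebrate mitochondrial code), treats ATG and ATA (both Methionine in that code) as start codons, expects 24 residues as in the human peptide, and assumes the input covers the whole gene region. Frameshifts are not detected directly, since that requires an alignment against a reference. The function names and test sequences are hypothetical.

```python
from itertools import product

# NCBI translation table 2 (vertebrate mitochondrial), codons ordered T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}
START_CODONS = {"ATG", "ATA"}  # assumption: Methionine codons act as the start


def translate_mito(dna: str) -> str:
    """Translate complete codons from the first base until a stop codon is met."""
    dna = dna.upper()
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")  # 'X' for ambiguous codons
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def classify(dna: str, expected_residues: int = 24) -> str:
    """Label a candidate sequence using the start-codon and premature-stop criteria."""
    if dna[:3].upper() not in START_CODONS:
        return "no start codon"
    if len(translate_mito(dna)) < expected_residues:
        return "premature stop"
    return "potentially functional"


# Synthetic 73-base test sequences (not real mtDNA).
intact = "ATG" + "GCA" * 23 + "T"
print(classify(intact))                                   # potentially functional
print(classify("ACG" + "GCA" * 23 + "T"))                 # no start codon
print(classify("ATG" + "GCA" * 5 + "TAA" + "GCA" * 17))   # premature stop
```

Note that in this genetic code TGA encodes Tryptophan and AGA and AGG are stop codons, so a standard nuclear translation table would misclassify some sequences.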

Amongst our closest relatives are chimpanzees and gorillas, and their Humanin sequences are able to produce functional peptide, albeit the sequences differ from that of the human by one residue (Figure 1). Some small mammals also appear able to produce functional peptide, e.g., the Guinea pig and the Northern tree shrew (Yao, 2017). However, it was unexpected to find that the Macaques show pseudogenization, as there is loss of the start codon.

Figure 1 Alignment of Humanin peptide sequences

Residues that differ from those of the human are marked in red. The change in a start codon from Methionine is marked with a box. For sequence information, please refer to Supplementary Table 1 (available online).

Most rat and mouse species were shown to have functional genes, but importantly some common mouse species (e.g., the House mouse) were found to have undergone pseudogenization. Other common mammals, such as the cat and the goat, were also shown to have undergone pseudogenization. Amongst the bony fishes, the zebrafish appeared able to produce Humanin, but many other fishes, such as the Triglidae, cannot. Some other vertebrate classes may also be able to produce Humanin, such as the lizards and birds, but whether the peptides are fully functional is uncertain. The most distant vertebrates from ourselves are the lungfish, hagfish and lampreys; whereas hagfish and lampreys show pseudogenization of their Humanin sequences, the lungfishes may have retained the ability to make a functional peptide.

The Tuatara, a reptile related to lizards and snakes and found only in New Zealand (Subramanian et al., 2015), was also shown to have a functional gene. But in this species the peptide is one amino acid longer, and it is possible that the gene underwent pseudogenization, only for a later deletion to make the gene functional once again.

Supplementary Table 2 (available online) describes the Humanin gene in 80 non-human vertebrate species. It is noteworthy that the ability to make the enhanced S14G form of Humanin was identified only in human sequences.

NUMT sequences in the human, Rhesus macaque, mouse and Golden hamster

As mentioned earlier, the question as to whether NUMT sequences are used to make Humanin is not resolved. But this data mining study showed that in the human there are two NUMT sequences that appear identical to the Humanin gene as found in the mtDNA, and a third sequence which is altered at only one amino acid. Also, in the Rhesus macaque, the mouse and the Golden hamster, the Humanin gene was shown to have undergone pseudogenization. However, it was also found that a possibly functional NUMT can be found in the chromosomes of each species, suggesting that the pseudogenization may have been a reasonably recent event in the evolutionary past of these species. The details of these sequences are given in Supplementary Table 3 (available online).

DISCUSSION

Most of the recent papers dealing with the Humanin peptide have centred on its ability to act as a protective agent, especially in AD (Hashimoto et al., 2001), as a retrograde signal peptide (Lee et al., 2013), or as an agent improving cognition (Murakami et al., 2017; Wu et al., 2017), but little has been said about the underlying biology of the Humanin gene. In this paper some of the basic points about Humanin have been examined by looking at the available sequences of human mtDNA and those of many other vertebrates found in the GenBank database.

In the mtDNA of vertebrates the Humanin gene is a gene-within-a-gene. It can be considered as a normal DNA gene, and as such has a start codon, a number of codons which code for amino acids, and finally a stop codon, which is often represented by a single Thymine nucleotide. However, the bases of the Humanin gene are also part of a large RNA structure that is used in the building of ribosomes. Therefore, the Humanin gene is both a DNA gene and an RNA gene, and as such it can be expected that there will be evolutionary pressure to maintain the gene so that it continues to function successfully in both forms. The presumption therefore is that the nucleotide bases of the Humanin gene will respond to this evolutionary pressure by showing a low mutation rate. Indeed, the results presented here do suggest that the Humanin gene has been strongly conserved throughout vertebrate evolution, and has continued to be functional, for example, in both the human and the lamprey, which have an evolutionary period of separation of over 360 million years (Xu et al., 2016).

In the human, for which there were over 36000 mtDNA sequences available for study, it appeared that 99.4% of the sequences did not show any mutational differences from the rCRS, but in 230 sequences mutations were observed. However, it is likely that the peptide functions normally in most people with mutations. For example, the mutation C2639T, which is found in people of Haplogroup N1b, changes the 3rd amino acid from a proline to a serine, and a change this close to the end of a peptide usually has little effect. At the other end of the peptide the mutation G2702A changes the 24th amino acid from an alanine to a threonine and can similarly be expected to have little effect. Indeed, changes to the 3rd and 24th amino acids feature prominently in the sequences from other vertebrates.

However, of much greater significance is the mutation A2672G, which causes the change of the 14th amino acid from a serine to a glycine. This change, S14G, is considered to increase the potency of the peptide (Li et al., 2013), and it appears that a few people in the world are able to produce this special form of Humanin naturally. Interestingly, this mutation is for the most part associated with the Haplogroup U6a7a1a, which contains members of an extended Acadian family found in Canada (Secher et al., 2014).

As for other vertebrates, there are mtDNA sequences from over 4000 species in the GenBank database, and it has been possible to identify the Humanin gene unambiguously in all the sequences examined. However, and unexpectedly, many species show pseudogenization of the Humanin gene. For example, the human, chimpanzees, gorillas and many other monkeys do have functional genes, but the macaque monkeys, which are widely used in research, do not. Rats, guinea pigs and the northern tree shrew appear able to produce Humanin, but some mice, cats and goats do not. So overall, it appears that there is a somewhat random pattern to the evolutionary success of the Humanin gene, with it surviving in some species, while undergoing pseudogenization in other apparently closely related species.

However, as with any apparently random pattern, there are features that may well be worth considering further. In the case of the Humanin gene there does not appear to be any overt link to the size of an animal, its longevity or its innate intelligence. But it would seem that a potentially functional Humanin gene is commonly found in primates and birds. However, whether this indicates an evolutionarily conserved advantage, or is just a feature of the short evolutionary history of these two groups, remains to be determined.

The term pseudogene was used for the first time 40 years ago (Jacq et al., 1977) in a study looking at the genome of the African clawed toad, when it was found that the genome had multiple copies of a gene. These copies were considered to have no function and were termed pseudogenes. It has since been shown that the genomes of vertebrates contain many different types of pseudogene (Mighell et al., 2000), of which duplication of genes and NUMTs are two of the types. Many pseudogenes are so fragmentary or degraded by subsequent mutations that they are clearly non-functional, but as mentioned earlier, it is possible that some chromosomal copies of the Humanin gene are the exception and might still be expressed. Another type of pseudogene formation occurs when a gene is affected by a mutation, or some other process, so that it stops functioning; this process is called pseudogenization. Typically this will occur as the result of loss of the start codon or the introduction of a premature stop codon. Duplication of a gene, and the subsequent loss of functioning of one copy, forms another type of pseudogene, but this does not appear to apply to the Humanin gene, at least in the human genome (Stark et al., 2017).

The pseudogenization of the Humanin gene as detailed here may not solely be an interesting point of evolution, but may also have significance in studies that use animal models. There have been many studies looking at the effect of Humanin in disease, in particular in AD, which have used animal models, and the results obtained from these studies may well have been affected by whether the animals had functional copies of the Humanin gene, or pseudogenes. Also, research to find new animal models for many diseases is an important field (Xiao et al., 2017; Yao, 2017), and it would now seem that sequencing of the mtDNA, and in particular the Humanin gene, should become routine in any animal under consideration.

Overall, there is still much to be discovered about Humanin. In our own species, and in our close relatives, the peptide appears to be useful and has been preserved. Many other species do appear to have kept the ability to produce a Humanin-like peptide, but whether the peptide has the same function as in the human is as yet unknown. This study has also shown that there are many species where pseudogenization has taken place and the Humanin pseudogene continues as a sequence of nucleotide bases that form part of the structure of ribosomes.

Why the Humanin gene has undergone pseudogenization in some species has not been determined. But it would appear that in many species there has been little evolutionary pressure to preserve their ability to make Humanin, and it is only in some species, including ourselves, that it appears to give some evolutionary advantage.

This paper has looked at Humanin by reviewing the evidence about the gene available from several sources, and suggests a number of places where experimental work is needed to confirm the findings. It is also possible that future practical work may show that this peptide, in its original or in a synthetic form, may have a therapeutic use in the treatment of conditions such as Alzheimer’s disease.

ACKNOWLEDGEMENTS

The author is very grateful to Yong-Gang Yao (Kunming Institute of Zoology, CAS) for his many helpful comments, especially those concerning the presentation of the material in this paper.

REFERENCES

Anderson S, Bankier AT, Barrell BG, de Bruijn MHL, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, Schreier PH, Smith AJH, Staden R, Young IG. 1981. Sequence and organization of the human mitochondrial genome., 290(5806): 457-465.

Andrews RM, Kubacka I, Chinnery PF, Lightowlers RN, Turnbull DM, Howell N. 1999. Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA., 23(2): 147.

Bandelt HJ, Kloss-Brandstätter A, Richards MB, Yao YG, Logan I. 2014. The case for the continuing use of the revised Cambridge Reference Sequence (rCRS) and the standardization of notation in human mitochondrial DNA studies., 59(2): 66-77.

Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. 2005. GenBank.33(S1): D34-D38.

Bodzioch M, Lapicka-Bodzioch K, Zapala B, Kamysz W, Kiec-Wilk B, Dembinska-Kiec A. 2009. Evidence for potential functionality of nuclearly-encoded humanin isoforms., 94(4): 247-256.

Cohen A, Lerner-Yardeni J, Meridor D, Kasher R, Nathan I, Parola AH. 2015. Humanin Derivatives Inhibit Necrotic Cell Death in Neurons., 21(1): 505-514.

Hashimoto Y, Niikura T, Tajima H, Yasukawa T, Sudo H, Ito Y, Kita Y, Kawasumi M, Kouyama K, Doyu M, Sobue G, Koide T, Tsuji S, Lang J, Kurokawa K, Nishimoto I. 2001. A rescue factor abolishing neuronal cell death by a wide spectrum of familial Alzheimer’s disease genes and Aβ., 98(11): 6336-6341.

Hashimoto Y, Nawa M, Kurita M, Tokizawa M, Iwamatsu A, Matsuoka M. 2013. Secreted calmodulin-like skin protein inhibits neuronal death in cell-based Alzheimer's disease models via the heterotrimeric Humanin receptor., 4(3): e555.

Jacq C, Miller JR, Brownlee GG. 1977. A pseudogene structure in 5S DNA of Xenopus laevis., 12(1): 109-120.

Lee C, Yen K, Cohen P. 2013. Humanin: a harbinger of mitochondrial-derived peptides?, 24(5): 222-228.

Li X, Zhao WC, Yang HQ, Zhang JH, Ma JJ. 2013. S14G-humanin restored cellular homeostasis disturbed by amyloid-beta protein.2013. 8(27): 2573-2780.

Matsuoka M. 2015. Protective effects of Humanin and calmodulin-like skin protein in Alzheimer's disease and broad range of abnormalities., 51(3): 1232-1239.

Mighell AJ, Smith NR, Robinson PA, Markham AF. 2000. Vertebrate pseudogenes., 468(2-3): 109-114.

Murakami M, Nagahama M, Maruyama T, Niikura T. 2017. Humanin ameliorates diazepam-induced memory deficit in mice., 62: 65-70.

Pruitt KD, Tatusova T, Maglott DR. 2007. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins., 35(Database issue): D61-D65.

Romeo M, Stravalaci M, Beeg M, Rossi A, Fiordaliso F, Corbelli A, Salmona M, Gobbi M, Cagnotto A, Diomede L. 2017. Humanin Specifically Interacts with Amyloid-β Oligomers and Counteracts Their in vivo Toxicity., 57(3): 857-871.

Secher B, Fregel R, Larruga JM, Cabrera VM, Endicott P, Pestano JJ, González AM. 2014. The history of the North African mitochondrial DNA haplogroup U6 gene flow into the African, Eurasian and American continents., 14: 109.

Stark TL, Liberles DA, Holland BR, O'Reilly MM. 2017. Analysis of a mechanistic Markov model for gene duplicates evolving under subfunctionalization.17(1): 38.

Subramanian S, Mohandesan E, Millar CD, Lambert DM. 2015. Distance-dependent patterns of molecular divergences in Tuatara mitogenomes.. 5: 8703.

Tajima H, Niikura T, Hashimoto Y, Ito Y, Kita Y, Terashita K, Yamazaki K, Koto A, Aiso S, Nishimoto I. 2002. Evidence for in vivo production of Humanin peptide, a neuroprotective factor against Alzheimer's disease-related insults., 324(3): 227-231.

Terashita K, Hashimoto Y, Niikura T, Tajima H, Yamagishi Y, Ishizaka M, Kawasumi M, Chiba T, Kanekura K, Yamada M, Nawa M, Kita Y, Aiso S, Nishimoto I. 2003. Two serine residues distinctly regulate the rescue function of Humanin, an inhibiting factor of Alzheimer's disease-related neurotoxicity: functional potentiation by isomerization and dimerization.85(6): 1521-1538.

van Oven M, Kayser M. 2009. Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation., 30(2): E386-E394.

Wu M, Shi H, He YX, Yuan L, Qu XS, Zhang J, Wang ZJ, Cai HY, Qi JS. 2017. Colivelin Ameliorates Impairments in Cognitive Behaviors and Synaptic Plasticity in APP/PS1 Transgenic Mice., 59(3): 1067-1078.

Xiao J, Liu R, Chen CS. 2017. Tree shrew (Tupaia belangeri) as a novel non-human primate laboratory disease animal model., 38(3): 127-137.

Xu Y, Zhu SW, Li QW. 2016. Lamprey: a model for vertebrate evolutionary research.37(5): 263-269.

Yao YG, Salas A, Logan I, Bandelt HJ. 2009. mtDNA data mining in GenBank needs surveying., 85(6): 929-933.

Yao YG. 2017. Creating animal models, why not use the Chinese tree shrew (Tupaia belangeri chinensis)?, 38(3): 118-126.

Zaki MJ, Karypis G, Yang J. 2007. Data Mining in Bioinformatics (BIOKDD)., 2: 4.

Received: 20 May 2017; Accepted: 05 July 2017

E-mail: ianlogan22@btinternet.com

DOI: 10.24272/j.issn.2095-8137.2017.049