
Genomic integrity of human induced pluripotent stem cells: Reprogramming, differentiation and applications

2019-12-22

World Journal of Stem Cells, 2019, Issue 10

Clara Steichen, Zara Hannoun, Eléanor Luce, Thierry Hauet, Anne Dubart-Kupperschmitt

Clara Steichen, Thierry Hauet, INSERM U1082 IRTOMIT, CHU de Poitiers, Poitiers F-86021, France

Clara Steichen, Thierry Hauet, Université de Poitiers, Faculté de Médecine et Pharmacie, Bâtiment D1, 6 rue de la Milétrie, TSA 51115, 86073 Poitiers Cedex 9, France

Zara Hannoun, Eléanor Luce, Anne Dubart-Kupperschmitt, INSERM U1193, Hôpital Paul Brousse, Villejuif F-94800, France

Zara Hannoun, Eléanor Luce, Anne Dubart-Kupperschmitt, UMR_S1193, Université Paris-Saclay, Hôpital Paul Brousse, Villejuif F-94800, France

Zara Hannoun, The Jenner Institute, University of Oxford, Oxford OX3 7DQ, United Kingdom

Eléanor Luce, Anne Dubart-Kupperschmitt, Département Hospitalo-Universitaire Hepatinov, Hôpital Paul Brousse, Villejuif F-94807, France

Thierry Hauet, Service de Biochimie, Pôle Biospharm, CHU de Poitiers, Poitiers F-86021, France

Thierry Hauet, Fédération Hospitalo-Universitaire SUPORT, CHU de Poitiers, Poitiers F-86021, France

Abstract

Key words: Induced pluripotent stem cells; Genomic integrity; Mutations; Karyotype; Differentiation; Cell therapy; Quality control; Reprogramming

INTRODUCTION

Human induced pluripotent stem cells (hiPSCs) are artificial cells generated through complex genetic and epigenetic reprogramming of cultured somatic cells. They are close to human embryonic stem cells (hESCs) in their pluripotency and infinite self-renewal capacity, but also in their genomic integrity. The first section of this review describes the different types of genomic abnormalities reported in hiPSCs, ranging from large karyotype alterations to single nucleotide mutations. We then focus on the reprogramming process and its impact on the iPSC genome, and discuss whether reprogramming parameters can be adapted to minimize their effects on the cell genome. In a third part, we examine how iPSC genomic integrity is affected by both long-term culture and differentiation. Finally, we discuss the impact of genomic alterations on the possible uses of hiPSCs and their derivatives, as well as the quality controls that need to be performed.

MATERIAL AND METHODS

This review is based on a systematic search of PubMed (http://www.ncbi.nlm.nih.gov/pubmed) using the following keywords, either separately or in combination: “induced pluripotent stem cells, genomic integrity, and genomic stability”. We prioritized articles focusing on hiPSCs, and included those focusing on either hESCs or pluripotent stem cells from other species, including mouse, if they were particularly relevant in the context. We apologize for all the articles that we could not cite due to word limitations.

TYPE OF GENOMIC ABNORMALITIES OBSERVED IN HIPSCS: AN UPDATE

Karyotype aberrations

Although some cell lines maintain a normal karyotype after long-term culture, hiPSCs, like hESCs, present a propensity towards genomic instability. Based on karyotype analysis using G-banding, a number of hiPSC lines present aneuploidies, including recurrent ones mainly acquired during long-term culture, such as trisomy of chromosome 12, 17 or X, or amplification of specific loci. These abnormalities have been extensively reviewed[1-5]. It is now accepted that these chromosomal and subchromosomal aberrations are common features of both hESCs and hiPSCs. However, it is unclear whether this trend is a specific feature associated with pluripotency or whether it would also be observed in non-pluripotent cells if they could be maintained in long-term culture.

In addition to the more commonly known chromosomal abnormalities, an aberration known as uniparental disomy (UPD) has been reported in iPSCs[6,7]. UPD occurs when a daughter cell inherits two copies of a chromosome (or part of a chromosome) from one parental cell and no copy of the corresponding chromosome (or part of the chromosome) from the other parental cell. UPD can be associated with various clinical symptoms: it may lead to homozygosity of a recessive allele involved in a genetic disorder, or to an imbalance of paternal versus maternal epigenetic information, which may lead to dysfunction when imprinted genes are present[8]. For example, UPD of chromosome 15 leads either to Angelman syndrome (if both copies of a section of chromosome 15 are obtained from the father) or to Prader-Willi syndrome (if both copies are obtained from the mother), both serious developmental disorders. UPD had not been reported in the context of hiPSCs until Bershteyn et al[6] used fibroblasts from patients affected with Miller-Dieker syndrome who carried a ring chromosome 17, a genetic disease linked with congenital malformations. The authors reprogrammed these fibroblasts into hiPSCs using episomal vectors. Surprisingly, they showed that multiple hiPSC lines generated from these fibroblasts did not contain the ring chromosome. This was explained by the non-participation of the ring chromosome during reprogramming, leading to UPD of the whole chromosome 17. They thus showed the cell-autonomous correction of a ring chromosomal aberration via compensatory UPD during iPSC generation, opening the door to chromosome therapy using iPSCs.

Our laboratory has also described UPD in hiPSCs in a non-compensatory context. One of the iPSC lines generated from normal human foreskin fibroblasts by repeated transfections with home-made mRNAs presented a complex and abnormal karyotype as well as a large region of UPD on chromosome 1q[7]. Interestingly, despite normal behavior in vitro in terms of stemness marker expression and differentiation into cells from all three germ layers, this iPSC line was unable to form teratomas in vivo. The potential link between these genomic rearrangements and this feature has not yet been elucidated. However, this observation demonstrated that UPD can also occur in hiPSCs in a non-compensatory context, even when a non-integrative reprogramming strategy is used. Moreover, this work highlights the importance of including single nucleotide polymorphism (SNP) genotyping among the methods used for the quality control of hiPSC genomic integrity, because UPD can only be detected by this method, which enables accurate detection of regions of consecutive loci with loss of heterozygosity (LOH).
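The idea of finding "regions of consecutive loci with LOH" from SNP genotyping can be illustrated with a minimal sketch. It is not the pipeline used in the cited studies: the function names, the two-letter genotype encoding ("AA", "AB", "BB") and the `min_informative` threshold are assumptions made for illustration, and the sketch assumes every call is a valid genotype ordered by genomic position. It compares an iPSC clone with its parental fibroblasts and reports stretches where every SNP that was heterozygous in the parent has become homozygous in the clone. Distinguishing copy-neutral LOH (UPD) from a deletion would additionally require copy number information, which is not modeled here.

```python
from typing import List, Tuple


def is_heterozygous(genotype: str) -> bool:
    """True for a two-allele call with different alleles, e.g. 'AB'."""
    return len(genotype) == 2 and genotype[0] != genotype[1]


def loh_segments(parent: List[str], clone: List[str],
                 min_informative: int = 3) -> List[Tuple[int, int]]:
    """Return (first_index, last_index) of stretches of LOH in the clone.

    Only SNPs heterozygous in the parent are informative. A stretch is
    reported when at least `min_informative` consecutive informative
    SNPs are homozygous in the clone (hypothetical threshold).
    """
    if len(parent) != len(clone):
        raise ValueError("parent and clone must cover the same SNPs")
    segments = []
    start = None   # index of the first LOH SNP in the current stretch
    last = None    # index of the most recent LOH SNP
    count = 0      # informative LOH SNPs in the current stretch
    for i, (p, c) in enumerate(zip(parent, clone)):
        if not is_heterozygous(p):
            continue  # parent homozygous: locus cannot reveal LOH
        if not is_heterozygous(c):
            if start is None:
                start, count = i, 0
            last = i
            count += 1
        else:
            # heterozygosity retained: close any open stretch
            if start is not None and count >= min_informative:
                segments.append((start, last))
            start, count = None, 0
    if start is not None and count >= min_informative:
        segments.append((start, last))
    return segments


parent = ["AB", "AB", "AA", "AB", "AB", "AB", "AB"]
clone = ["AB", "AA", "AA", "BB", "AA", "AB", "AB"]
print(loh_segments(parent, clone))  # [(1, 4)]
```

In practice the threshold would be set in terms of genomic distance and marker density rather than a raw SNP count, which is one reason array resolution matters for this kind of quality control.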

Copy number variations

Copy number variations (CNVs) are variations in the number of copies of DNA sections, consisting of either genomic sequence deletions or amplifications. The occurrence of CNVs in human pluripotent stem cells (hPSCs) was first highlighted in 2011. Laurent et al[9] performed an extensive analysis of 324 samples using high-resolution SNP genotyping. These samples included 37 hiPSC lines, 69 hESC lines, and non-pluripotent somatic cell lines or primary cell lines. The authors showed that the number of CNVs in hiPSCs was significantly higher than in non-pluripotent samples. These results were confirmed in another study by Hussein et al[10], which analyzed 22 hiPSC lines and showed a higher level of CNVs in hiPSC lines than in fibroblasts or hESC lines. The distribution of these CNVs is not random: they frequently affect common fragile sites or sub-telomeric regions, both of which can be particularly sensitive to DNA double-strand breaks. Two hypotheses may explain the high level of CNVs in hiPSCs compared to hESCs or human somatic cell samples: either they are acquired de novo during the reprogramming process, or they pre-exist in the initial somatic cell population and are amplified or selected through reprogramming and subsequent culturing.

Single point mutations

Karyotyping, SNP genotyping and comparative genomic hybridization (CGH)-array analyses are techniques used to detect deletions or duplications in large parts of the genome, each with a specific detection limit (minimal size of a CNV detected) and resolution (genome coverage). However, these techniques are unable to detect single point mutations, which can only be observed using sequencing. Through whole exome sequencing, Gore et al[11] analyzed the presence of single point mutations in 22 hiPSC lines and the 9 fibroblast populations they were derived from. The authors showed that each iPSC line contained an average of 6 protein-coding mutations (i.e., mutations in a coding region of the genome). These results have been confirmed by others, demonstrating the presence of between 6 and 12 single-nucleotide mutations in each human iPSC genome[12,13]. As noted with CNVs, there are two possibilities regarding the origin of these mutations: are they pre-existing in the initial population before reprogramming, or acquired during reprogramming? The answer is most likely a combination of both, and its importance is discussed in depth in section IIA. Another question remains: do these mutations offer a selective advantage for reprogramming, or are they randomly amplified? Despite numerous debates, there is as yet no consensus in the field, and these two hypotheses are not mutually exclusive. On the one hand, various authors suggest that selection is possible, as specific mutations have been found in at least 2 iPSC lines derived from the same fibroblast population, or because these mutations frequently involve specific pathways[11,14]. On the other hand, other studies were unable to detect such ‘shared’ mutations and therefore do not support this hypothesis[12].

THE IMPACT OF REPROGRAMMING: FINDING THE KEY TO GENOMIC INTEGRITY?

The existence of mutations in iPSCs is now well established. The subsequent challenge is to understand whether such genomic abnormalities can be reduced or minimized. Several factors have been highlighted as potentially maintaining or compromising iPSC genome integrity; they are discussed in the following section.

The importance of somatic mosaicism; choosing the right cell type to reprogram

The importance of somatic mosaicism (the coexistence of cells with different genotypes in a cell population) in the context of iPSCs has been demonstrated in a study focusing on Down syndrome [resulting from chromosome 21 trisomy (Ts21)]. In rare cases (1%-3% of patients), patients are mosaic for this mutation, whereby only a percentage of their cells carry the trisomy. In this study, the authors used a fibroblast population from a mosaic patient in which 90% of the cells carried Ts21, whereas the remaining 10% were euploid. They generated 3 iPSC lines from this fibroblast population and demonstrated, through fluorescence in situ hybridization analysis, that two cell lines contained Ts21 whilst one cell line was euploid for chromosome 21, highlighting the clonal nature of reprogramming and its impact on the iPSC genome[15]. The authors also performed SNP analysis and excluded the possibility of UPD, which could otherwise have explained a trisomy rescue[15].

This example highlights the importance of considering somatic mosaicism as a crucial parameter when ensuring the maintenance of hiPSC genomic integrity, as iPSC generation involves the cloning and amplification of the genome of one unique cell. Somatic mosaicism accumulates during mitosis and is therefore acquired both during early development and during the normal aging process. It has been shown to affect various tissues such as skin, cerebellum, liver, intestine or digestive tract, and depends on the tissue self-renewal rate and on exposure to environmental stress such as ultraviolet radiation[16,17] or endogenous mutagenic factors such as transposable elements[18]. Since such events accumulate with ageing, donor age has recently been shown to be associated with an increased risk of abnormalities in iPSCs[19]. The definition of somatic mosaicism also includes genomic alterations of varying size, ranging from chromosome gains or losses to single nucleotide substitutions.

A number of studies have focused on the genomic integrity of iPSCs, highlighting the contribution of somatic mosaicism, either through the acquisition of CNVs or single point mutations. Abyzov et al[20] analyzed 20 hiPSC lines generated from 7 different fibroblast populations. They showed that each iPSC line contained an average of 2 CNVs (< 10 kb). Using both polymerase chain reaction (PCR) performed across CNV breakpoints and droplet digital PCR, the authors showed that at least 50% of the CNVs detected in the hiPSC lines were present at a very low frequency in the original fibroblast population and can therefore be explained by somatic mosaicism. It should be noted that this value (50%) may be an underestimate, depending on the detection level of the technique used and the quantitative contribution of the CNV[20]. The authors analyzed the 7 populations of fibroblasts and showed that 30% of them contained CNVs when compared to a human genome reference sequence such as GRCh37, highlighting a high degree of somatic mosaicism in fibroblasts.

Investigations focusing on single point mutations, specifically protein-coding mutations, have also underlined the contribution of somatic mosaicism to iPSC line genetic abnormalities; however, the quantitative estimate differs from one study to another. One study described an average of 6 protein-coding mutations per hiPSC genome. The authors then quantified the frequencies of these mutations in the corresponding fibroblast lines using ultra-deep sequencing and showed that approximately 53% of the mutations were found in the original fibroblast lines, at frequencies ranging from 0.3 to 1000 in 10000[11]. These conclusions have been further supported by another study showing that at least 17% of protein-coding mutations in hiPSCs can be detected in the originating fibroblast population[13]. Moreover, using next generation sequencing on both iPSC clones and the fibroblast subclones they were derived from, Kwon et al[21] showed that only a small number of variants remained undetectable in the parental fibroblasts. These data have also been reinforced in the mouse model by a study demonstrating that different murine iPSC lines share SNP variants, suggesting that these mutations are present in a subpopulation of the fibroblasts[14].

The existence of somatic mosaicism also raises the question of whether it is necessary to generate isogenic controls when using iPSCs for disease modeling. To date, “normal” iPSCs, derived from an unaffected individual, are often taken as a control for pathological iPSCs. However, considering the importance of the genetic background of each iPSC line, the optimal control would be an isogenic iPSC line. These cell lines can either be generated by specifically targeting the mutation in the affected iPSC line using recently developed genome editing strategies (CRISPR/Cas9 or TALENs)[22], or be obtained by chance, as reported in the Down syndrome study, where the euploid iPSC line could be used as the optimal isogenic control to study the physiopathology of the disease[15].
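The remark that the proportion of pre-existing variants may be underestimated "depending on the detection level of the technique used" can be made concrete with a small calculation. The sketch below is an illustrative assumption, not a method from the cited studies: it treats each sequencing read as an independent draw, ignores sequencing errors, and uses a simple binomial model to estimate the chance of observing a variant present at a given frequency in the fibroblast population. The function name and the `min_reads` parameter are hypothetical.

```python
from math import comb


def detection_probability(freq: float, depth: int, min_reads: int = 1) -> float:
    """Probability of seeing a variant in at least `min_reads` reads.

    freq  : fraction of reads expected to carry the variant
    depth : number of reads covering the position
    Binomial model; sequencing error is ignored (simplifying assumption).
    """
    p_below = sum(
        comb(depth, k) * freq**k * (1.0 - freq) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - p_below


rare = 0.3 / 10000  # lowest frequency quoted in the text
for depth in (100, 10_000, 100_000):
    print(depth, round(detection_probability(rare, depth), 3))
```

Under these assumptions, a variant at 0.3 in 10000 is almost never seen at a depth of 100 reads and only becomes likely to be observed at depths on the order of 100000 reads, which is why ultra-deep sequencing or digital PCR is needed to attribute a mutation to somatic mosaicism rather than to reprogramming.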

Based on these conclusions, the next question to address is “Can we reduce the contribution of somatic mosaicism by using a specific cell type, and thus improve the control of genomic integrity in iPSCs?” Unfortunately, the answer is still not clear. Various cell types have been shown to be amenable to reprogramming, including fibroblasts, keratinocytes, mesenchymal stem cells, blood cells, hepatocytes, etc. However, little is known about the extent of somatic mosaicism in the different cell types, and comparative genomic analyses are not currently available. Nevertheless, two cell types, blood-derived cells and urine-derived cells, have been found to be suitable for reprogramming, with the added advantage of being easily obtained.

In a first article, the authors isolated endothelial progenitor cells (EPCs) from peripheral blood and successfully reprogrammed them into iPSCs using retroviral vectors. The team performed karyotype and CGH-array analyses and did not detect any genomic abnormalities in 9 of the 11 EPC-iPSC lines. The remaining two EPC-iPSC lines each had one copy gain, of 36.6 and 632.7 kb in size, respectively[23]. Previous studies have shown that higher numbers of CNVs are detected when fibroblasts are used as the starting material for reprogramming[10,24,25]; the authors therefore suggest that EPCs, which are easily isolated and present a relatively immature phenotype, could be used to generate genetically healthy iPSCs. In another article, the authors reprogrammed human cord blood (CB) CD34+ cells using lentiviral vectors. Whole exome sequencing of 5 iPSC lines detected an average of 1.3 coding mutations per iPSC line, which is lower than in previous studies using the same analysis technique[26], although CB is not an optimal source in terms of accessibility in the context of personalized medicine. However, a direct comparison of both starting cell types in comparable conditions, i.e., iPSCs generated in parallel with the same reprogramming methods, culture conditions, genomic analysis techniques and detection criteria, would be needed to confirm these results. Nagaria et al[27] showed that hiPSCs derived from CB myeloid progenitors closely resembled hESCs in DNA repair gene expression signature and irradiation-induced DNA damage response, relative to hiPSCs generated from CB or fibroblasts via standard methods.

Another cell type of interest is urine-derived cells. Since the first proof of concept in humans[28], it has been shown that human iPSCs can be successfully generated from urine-derived cells in xeno-free conditions[29]. However, apart from the absence of genomic integration after episomal reprogramming and the conservation of a normal karyotype, there are no additional data on the genomic integrity of these cells. Thus, an extensive study of the incidence of somatic mosaicism in the different cell types would be highly beneficial.

The method of reprogramming

It is well known that the integration of a viral cassette into the genome is directly linked to a risk of insertional mutagenesis[30]. To overcome this issue, a number of teams have focused over the last few years on developing non-integrative reprogramming methods, in order to bridge the gap towards the use of iPSCs in a clinical setting. Although non-integrative reprogramming methods will be a prerequisite in the future, only a few articles report an analysis of the impact of the reprogramming method on iPSC genomic integrity.

Initially, focusing on single-nucleotide coding mutations detected by exome capture sequencing, Gore et al[11] did not observe a link between the reprogramming method and the number of protein-coding mutations. The study investigated three different integrative methods and two non-integrative methods, using a total of 22 iPSC lines. This investigation pioneered the quantification of genomic integrity in hiPSCs. However, one limitation of this study was the use of hiPSC lines from different laboratories (each with its own culture methods); it therefore cannot be regarded as a strict comparison between the different reprogramming techniques. Another large cohort was analyzed by Hussein et al[10]. The authors analyzed 22 hiPSC lines generated within their laboratory through either retroviral transduction or piggyBac gene delivery. Using an Affymetrix SNP array, the authors found approximately 109 CNVs per iPSC line (minimal size 10 kb, 10 markers). Once again, the delivery method of the reprogramming factors did not influence the results.

On the other hand, a few articles highlight a potential impact of the reprogramming technique using smaller cohorts of hiPSC lines. Cheng et al[12] analyzed three hiPSC lines generated by episomal reprogramming of blood-derived CD34+ cells or MSCs. The authors carried out whole genome sequencing as well as CNV analysis and observed 6 to 12 coding mutations per iPSC line, reinforcing previously published data[11], and demonstrated the complete absence of CNVs in the three iPSC lines[12]. In another article, Boreström et al[31] successfully reprogrammed both human foreskin fibroblasts and primary chondrocytes using the mRNA reprogramming system provided by Stemgent, which was based on the work of Warren et al[32]. They performed both karyotype analysis and CNV analysis by Affymetrix SNP 6.0 array and observed that all the iPSC lines generated were free of acquired CNVs[31]. However, the minimal size of CNV detected and the detection criteria were not indicated; furthermore, additional whole genome or exome sequencing would be necessary to fully confirm the development of a “footprint-free” iPSC generation strategy.

Given the importance of this issue, our team assessed the genomic integrity of iPSC lines generated using repeated transfections of mRNAs. We also analyzed iPSCs generated by retroviral transduction as a comparative control. All the analyzed hiPSC lines originated from the same fibroblast population and were cultured in exactly the same conditions. Using SNP analysis, we demonstrated that mRNA-derived iPSCs do not differ significantly from the parental fibroblasts, whereas significant differences were noted between retrovirus-derived iPSCs and the parental fibroblasts. On the other hand, CNV analysis indicated that the number of CNVs may not depend on the reprogramming method itself, but instead appeared to be clone-dependent[33].

The first evidence of a link between the number of CNVs and the reprogramming method was obtained in a mouse model. Park et al[34] reprogrammed murine primary hepatocytes using either a polycistronic vector (lentiviral or retroviral transduction of OKSM factors) or repeated delivery of purified recombinant proteins. CNV analysis was then performed using a custom 1M array CGH platform on 10 iPSC lines at passage 18. The authors showed an increase in CNV content in the lenti-miPSC and retro-miPSC lines, which had from 29 to 53 CNVs depending on the cell line, compared to protein-miPSC lines (from 9 to 10 CNVs)[34].

Because generating hiPSCs with different reprogramming strategies in comparable conditions is costly and labor intensive, and because high-quality genomic analysis requires substantial financial resources and expertise, limited data exist on the impact of non-integrative versus integrative reprogramming strategies. Addressing these issues, an extensive study was recently published. The authors compared 3 different non-integrative reprogramming methods (mediated by mRNA, Sendai virus or episomes) and 2 integrative reprogramming methods (lentivirus-mediated or retrovirus-mediated). Several parameters were analyzed, such as reprogramming efficiency, success rates, labor intensity, etc. Karyotype and CGH-array analyses were used to investigate the effects on hiPSC genomic integrity. Based on karyotype analysis of representative iPSC lines, the percentage of aneuploid iPSC lines generated was significantly lower for mRNA-iPSCs (2.3%) than for retrovirus (8.3%) or Epi-hiPSC (11.5%), an advantage for the mRNA strategy[35]. The authors also found that the majority of CNVs pre-existed in the fibroblast population, that the frequency of de novo CNVs was particularly low in all iPSC lines, and that there was no link between the reprogramming method and the number of CNVs, reinforcing the conclusion drawn in our laboratory and others. Another study later confirmed these results by comparing mRNA, retroviral and Sendai virus reprogramming strategies: it showed only subtle differences among the methods, with most of the detected variants also found in the fibroblast population[36]. In contrast, other studies reported that the number of CNVs and cytogenetic rearrangements in the genomes of integrating iPSC lines were 20 and 7 times higher, respectively, than those of non-integrating iPSC lines[37,38].

Taken together, the initial conclusions of these studies highlight the fact that no method has zero impact on iPSC genomic integrity, despite the advantage of non-integrative reprogramming methods over integrative ones. Further investigation, including extensive analysis using whole-genome sequencing, is required to fully understand the benefits of one reprogramming strategy over another with regard to maintaining iPSC genomic integrity. Indeed, even non-integrative reprogramming requires extensive analysis of the genomic integrity of the resulting iPSCs, combining methods that enable the detection of large rearrangements (karyotype analysis) including UPD (SNP analysis), deletions or duplications (CGH array or SNP analysis) and single point mutations (sequencing), especially when such cells are intended for therapeutic applications.

The impact of other parameters on the genomic stability of hiPSCs

Once the reprogramming strategy has been defined, specifically the choice of the starting cells and the reprogramming method, it is important to identify other parameters that have been shown to impact the genomic integrity of iPSCs. Chen et al[39] highlighted a potential dosage effect of the reprogramming factors on the occurrence of CNVs in iPSCs. The authors analyzed 41 mouse iPSC lines generated from the same parental donor. Various combinations of reprogramming factors (high-performance engineered factors versus normal reprogramming factors) and various concentrations of reprogramming factors were investigated. Using CGH-array, the authors showed that rates of CNVs were negatively correlated with the concentration of the classic Yamanaka factors and that the use of high-performance factors also led to a significant reduction in the CNV number. In parallel, the use of high reprogramming factor concentrations and high-performance factors led to a higher number of clones and reduced the time for the first colonies to appear, suggesting a direct relationship between reprogramming efficiency/strength and the genomic integrity of the iPSCs[39]. Sugiura et al[40] showed that these reprogramming-associated mutations arise during the initial stages of the conversion of these cells.

Culture conditions, in particular media composition, may also play a role in maintaining iPSC genomic integrity. Ji et al[41] showed that supplementing the reprogramming media with antioxidants could reduce genomic aberrations in hiPSCs. Using NAC (N-acetyl-cysteine) treatment followed by SNP analysis, the authors significantly reduced the number of CNVs, by 3.9-fold (12 CNVs versus 47 for the non-treated cells). However, due to the high variability between the results, single point mutations analyzed by high-throughput genome sequencing did not show any defined trend, and the mechanism behind the reduction in CNV number is not clear. Another group, Luo et al[42], used either a commercial antioxidant or a home-made cocktail of three antioxidant molecules (L-ascorbate, L-glutathione, and α-tocopherol acetate) and demonstrated that the in-house cocktail had a protective effect on one of the two hiPSC lines used. On the other hand, no obvious changes were observed with the commercial product.

Mechanistically, the induction of reprogramming factors within cells leads to an acceleration of the growth rate and therefore a higher metabolic demand. In this context, Lamm et al[43] highlighted that aneuploid hPSCs undergo DNA replication stress, resulting in defective chromosome condensation and segregation. Compared to mouse ESCs and fibroblasts, mouse iPSCs had a lower DNA damage repair capacity after ionizing radiation used to induce double-strand breaks[44]. Moreover, repair mechanisms seem to lose efficiency during long-term passaging of hiPSCs[45]. However, this finding has been challenged by a study showing that mouse iPSCs, compared to mouse ESCs and mouse differentiated cells, showed enhanced resistance to mutagenesis, with higher levels of base excision repair proteins[46]. Esteban et al[47] showed that reprogramming is directly associated with increased levels of reactive oxygen species within the cells, which may then lead to single- and double-strand DNA breaks, a major cause of genomic alterations. As an illustration, strategies aimed at limiting reprogramming-induced replicative stress, either by increasing checkpoint kinase 1 levels or by adding nucleoside supplements to the culture media, have been shown to protect against DNA damage during reprogramming[48,49].

In conclusion, these investigations suggest that iPSCs are likely to suffer from genomic instability during the reprogramming process, and that this instability is directly related to the efficiency of the reprogramming technique. In other words, an effective, robust reprogramming technique would be expected to generate iPSCs with a significantly reduced number of genomic alterations. Hence, it is reasonable to assume that efforts should now focus on increasing the efficiency of reprogramming as a whole. These conclusions have recently been reinforced in the mouse model. The authors used selected small molecules (PD0325901, SB431542, thiazovivin and ascorbic acid) combined with retroviral reprogramming and showed that, in addition to promoting rapid and efficient reprogramming of mouse fibroblasts, this cocktail of small molecules stabilizes genomic integrity (assessed by karyotyping) through the activation of the Zscan4 (zinc finger and SCAN domain containing 4) gene and the facilitation of DNA double-strand break (DSB) repair[50].

MAINTAINING GENOMIC INTEGRITY DURING CELL CULTURE AND DIFFERENTIATION

Culturing hiPSCs: Can these cells be pampered?

We have discussed in detail the various factors that can impact the genome integrity of iPSCs during the reprogramming procedure. However, specifically with regard to clinical applications, the final product is not the iPSC line itself but its differentiated progeny. PSCs are able to self-renew indefinitely in vitro, through regular manual passaging (commonly performed once or twice a week for human PSCs), to obtain a sufficient number of cells for further characterization and differentiation assays. A few years ago, human iPSCs were first used in the clinic to treat age-related macular degeneration. In this case, approximately 5 × 10⁵ iPSCs (easily obtained in culture) were required to generate a hiPSC-derived retinal pigmented epithelium (hiPSC-RPE) sheet with a diameter of 1 cm (sufficient to cover a macular area with a 3 mm diameter)[51]. However, the use of iPSCs in other applications, such as myocardial injury and non-human primate heart transplantation, requires the delivery of 1 × 10⁹ cells[52]; therefore, starting from single-colony selection, several successive passages are necessary to obtain a sufficient number of hiPSCs. Since 2004, it has been commonly accepted that long-term culture of hESCs is directly associated with classical aneuploidies such as trisomy of chromosome 12, 17 or X, or with sub-chromosomal aberrations, in chromosome 20 for example[53-57]. These aberrations have also been frequently reported in hiPSCs and have been shown to confer specific growth advantages: the recurrent trisomy of chromosome 12p, which contains the NANOG gene involved in cell pluripotency; trisomy of chromosome 17q, which includes genes such as SURVIVIN or STAT3 linked to self-renewal[55]; and the 20q11.21 duplication, which is linked to genes with anti-apoptotic effects[56,58]. Mayshar et al[59] demonstrated that these aberrations, previously detected by CGH-array, could also be identified using a gene expression analysis platform. The technique is based on the knowledge that biased gene expression is directly correlated with such chromosomal abnormalities, enabling a retrospective, albeit less sensitive, examination of iPSC genomic integrity. Laurent et al[9] revealed a trend in the appearance of CNVs, describing recurrent deletions of tumor suppressor genes at early passages and duplications of oncogenes at late passages. Moreover, analyzing both hESC and hiPSC lines, Hussein et al[10] concluded that long-term passaging is associated with a decrease in both the number of CNVs and their total size. With regard to iPSCs, the majority of the CNVs generated during the reprogramming process (either selected or acquired) disappeared after 30 passages. The authors suggest that DNA repair would not be sufficient to explain this phenomenon and hypothesize that hiPSCs undergo bidirectional CNV selection, both against and in favor of CNVs present at early passages, with the selection pressure being strongest during the first set of passages. In addition to active selection, de novo CNVs could also be acquired at late passages[10]. A recent study performed on 140 independent hESC lines highlighted recurrent dominant negative TP53 mutations, with the mutant allelic fraction increasing with passage number, suggesting that these mutations confer a selective advantage to the cells[60].
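The scale-up discussed above can be illustrated with a back-of-the-envelope calculation. The following minimal Python sketch estimates how many population doublings separate the two cell numbers quoted in the text (5 × 10⁵ and 1 × 10⁹ cells). It assumes ideal exponential growth with no cell loss at passaging, a simplification that is not taken from the cited studies, and the function name is purely illustrative.

```python
import math


def doublings_needed(start_cells: float, target_cells: float) -> int:
    """Minimum whole number of population doublings needed to go from
    start_cells to at least target_cells.

    Assumption (not from the cited studies): ideal exponential growth,
    i.e. every cell divides and no cells are lost at passaging.
    """
    if start_cells <= 0 or target_cells <= 0:
        raise ValueError("cell numbers must be positive")
    if target_cells <= start_cells:
        return 0
    return math.ceil(math.log2(target_cells / start_cells))


# Cell numbers quoted in the text: RPE sheet vs. cardiac application.
print(doublings_needed(5e5, 1e9))
```

Under these idealized assumptions, about eleven doublings separate the two figures. Real cultures would be expected to need more, since cells are lost at each passage, which lengthens the time during which culture-associated aberrations can be selected.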

During long-term culture, it can be assumed that the act of passaging itself is a stress factor that may in turn induce genomic instability. Various reports suggest that enzymatic passaging may lead to cytogenetic aberrations[57,61,62]. Furthermore, additional investigations have demonstrated that mechanical passaging is also associated with cytogenetic abnormalities[63,64]. The first systematic analysis of the impact of the passaging method was performed by Bai et al[65] using 3 different hESC lines. Through karyotype analysis and CNV detection using SNP genotyping, the authors first analyzed the number of CNVs present in these cell lines at passage 13 (considered as P0 for the temporal analysis). They showed that hESCs subsequently passaged, after p13, using enzymatic dissociation (TrypLE + Rho-kinase inhibitor Y27632) rapidly acquired additional CNVs (within 5 passages), whereas with mechanical passaging the number of CNVs remained stable over 10 passages, and up to 30 passages for one of the three cell lines tested. The authors also demonstrated that single-cell passaging, induced by enzymatic dissociation, was associated with increased DNA double-strand breaks, which could in turn be regarded as a cause of the CNVs observed. The study suggested that these abnormalities did not exist in the cells prior to the comparative analysis (based on a mathematical model taking into account the cell population doubling rates) and are therefore induced by the enzymatic passaging itself. The authors also showed that this effect is not due to the presence of ROCK inhibitor in the culture. It can reasonably be assumed that these findings are also applicable to hiPSC lines and that, in order to optimize their use in clinical applications, the culture time should be kept to a minimum. Recently, a longitudinal study was published combining CNV analysis with the culture of 3 hiPSC lines and one hESC line for a highly extended period of time (2 years). Using four selected combinations of culture and passaging conditions, the authors reported that enzymatic passaging on a feeder-free substrate was associated with an increased accumulation of genetic aberrations compared to mechanical passaging on feeder layers[65]. They also showed that the passaging method has a stronger effect than the substrate, reinforcing previously cited results[66]. Besides the passaging method, another study focusing on hESCs highlighted a correlation between cell culture density and the occurrence of DNA damage and genomic alterations, likely triggered by medium acidification due to increased lactate concentrations in high-density cultures. In this view, increasing the frequency of media changes returns DNA damage to its basal level[67]. However, 3D culture systems may rapidly replace conventional monolayer growth systems, and various reports have demonstrated that hPSCs may be expanded long-term using scalable 3D suspension culture systems[68], in chemically defined and xeno-free conditions. One article revealed that hESCs cultured in these conditions retained a normal karyotype for over 20 passages[69]. However, the potential protective effect of such a culture system on iPSC genomic integrity has yet to be investigated further.

Another component of the reprogramming system is the oxygen percentage in the incubator where the cells are cultured. The expansion of iPSCs is commonly performed in a 5% CO2, 20% O2 incubator (normoxia). However, this oxygen condition differs from the one present in the physiological “niche” of pluripotent stem cells found in the embryo inner cell mass, which is hypoxic[70]. iPSCs, by definition, do not have physiological niches and can therefore be generated both in normoxia and in hypoxia. Various articles suggest that hypoxia improves reprogramming efficiency[31,71,72] and can induce the re-entry of committed cells (spontaneously differentiated cells from ESC culture) into a fully pluripotent state[73]. The precise temporal role of hypoxia during human reprogramming has recently been studied[74]. In addition, one study demonstrated that expression of the DNA mismatch repair (MMR) system, which normally corrects replication errors, was downregulated in both mouse neural stem cells and human mesenchymal stem cells exposed to hypoxia[75]. This malfunction is also involved in the genomic instability of several tumors[76]. Finally, the MMR defect may partially explain why hypoxia (5% O2) is able to increase reprogramming efficiency, as is also the case with p53 inhibition, but likely at the cost of genomic integrity[77]. However, there is no strong evidence regarding the effect of long-term low-oxygen culture conditions on the maintenance of genomic integrity, despite the obviously limited oxidative stress in hypoxic conditions.

Last but not least, the culture of iPSCs prior to differentiation could include a genetic correction step, in the case of personalized iPSC-based therapy where the patient’s iPSCs carry a particular pathogenic mutation. In this respect, recent advances have been made using genome editing technologies on hPSCs (including TALEN and CRISPR/Cas9 systems[78,79]), enabling the precise targeting of specific sequences in the genome. However, potential off-target modifications of the genome have been reported[80] and will have to be carefully assessed to ensure maximum gene targeting efficiency and specificity[81]. Moreover, these genome editing strategies often involve the selection of a corrected single cell-derived clone through a selection pressure that relies on the expression of a drug resistance gene. Again, this selection favors the accumulation of genetic damage.

The effects of differentiation on hPSC genomic integrity

The directed differentiation of hPSCs into functional, terminally differentiated cells is now possible thanks to the availability of the matrices, cytokines and growth factors required to drive the differentiation process. Depending on the specific cell type, hPSC differentiation can be a long and arduous process for the cells, in which their stemness characteristics are lost and are replaced over time by the morphology and functional properties associated with the differentiated cell type. During human development, such differentiation processes take several weeks or months, whereas specific in vitro protocols try to recapitulate the process in a significantly shorter period of time. Considering the significant metabolic and epigenetic changes required to undergo such a transition, the following question arises: how does the differentiation process affect the genomic integrity of the pluripotent stem cells? Despite its importance when considering the use of differentiated hPSCs in therapeutic applications, only a limited number of published studies have investigated this aspect. One study differentiated six hESC lines into neural stem cell populations which could be propagated in vitro for over 50 passages without entering senescence. The researchers showed that this particular property was associated with a jumping translocation involving chromosome 1q. The analysis was performed after the long-term culture of these derivatives (at least 34 passages), suggesting a strong link between this abnormality and the cells’ adaptation to their new culture conditions[82]. In reality, a variety of genetic abnormalities could occur during the differentiation process itself, at a significantly more rapid rate. In a study previously described, the authors analyzed the occurrence of CNVs during a 7 d experiment differentiating hESCs into motor neuron progenitors. They found a partial duplication of 3 segments of chromosome 20 at day 7 in one of the differentiation experiments. This duplication was absent at day 2, suggesting that this type of abnormality could arise in a limited time, as short as 5 d[9]. Another article studied the presence of CNVs in neuroprogenitors derived from a hESC line or a patient-specific iPSC line using CGH array. The authors demonstrated that these differentiated cells contained CNVs, including CNVs inherited from the PSC line (and detectable in this line) but also de novo CNVs generated during differentiation. Some CNVs were also shown to have been lost during the differentiation process, suggesting that certain CNVs may offer a selective advantage or disadvantage for differentiation. On the other hand, Kammers et al[83] did not report any CNVs in iPSC-derived megakaryocytes compared to their undifferentiated counterparts, despite the expected upregulation of highly biologically relevant genes such as those related to megakaryocyte development, platelet activation, blood coagulation, etc.

Lastly, in the context of liver therapy, using three different hepatic differentiation experiments, we demonstrated that no de novo CNVs were triggered by our differentiation protocol[33], but the time scale (22-24 d) of this protocol probably does not allow the emergence of detectable CNVs, owing to the limited number of mitoses. To our knowledge, there are no additional studies focusing on the impact of differentiation on iPSC genomic integrity. This could be partially explained by the fact that the majority of quality controls necessary for use of the cells in therapeutic applications are carried out at the iPSC stage, when defining iPSC master cell banks, as it is commonly known how deleterious reprogramming can be for the genomic integrity of the cells. However, this option overlooks the direct impact of differentiation on the cell genome. Another reason for the lack of interest could be risk minimization, based on the fact that differentiated cells have a considerably decreased ability to proliferate compared to their pluripotent counterparts (which should be eliminated upon differentiation or selectively removed during the process). However, in the case of liver cell therapy for example, in addition to transplanting fully mature and functional hepatocytes, an alternative could be to transplant hepatic progenitors which are able to mature in vivo in an optimal microenvironment. As these progenitors maintain their proliferation capacity, the concept would be to improve overall cell engraftment and proliferation once transplanted. Nonetheless, there is so far no consensus regarding the exact time point during differentiation at which the cells are optimal for use in transplantation. Moreover, large-scale investigations using high-throughput genomic analysis techniques have yet to be carried out. Therefore, it is imperative to continue assessing the impact of differentiation on the genomic integrity of cells, while developing more reliable and efficient differentiation protocols with potential for clinical use.

WHAT ARE THE IMPACTS OF GENOMIC ALTERATIONS IN HIPSCS AND THEIR DERIVATIVES?

Enforcing hiPSC genomic integrity quality control

The last section of the review will try to address the following question: What are the impacts of these genomic alterations in hiPSCs and their derivatives? This issue is not simple and has already been touched on in recent reviews[81,84]. For the purpose of discussion, let us assume that the relevance of hiPSC mutations, and therefore the importance of performing genomic integrity quality control, is highly dependent on the application the cells will be used for. To illustrate this idea, various potential examples of hiPSC-based therapies will be discussed. In the first example, differentiated cells derived from hiPSCs may be used ex vivo in extracorporeal devices. More specifically, hiPSC-derived hepatocytes could be used in external bio-artificial devices aimed at temporarily replacing the liver for patients suffering from acute liver failure whilst they wait for an organ transplant. In this case, the important parameter is the functionality of the cells and, because the cells are not injected into the patients, we could accept that mutations will not prevent their use in this application (assuming that such mutations do not have a direct impact on hepatic functions). Another example is the use of hiPSCs to perform personalized therapy to treat a child affected by a life-threatening disease, a strategy expected to bring long-term benefits. Once the patient’s cells are collected, patient-specific iPSC lines could be generated for the purpose of in vivo cell therapy. In this case, stringent genomic integrity controls should be performed in an attempt to identify a “safe” iPSC line (the question of exactly which tests need to be performed will be discussed subsequently). Ideally, autologous iPSCs would be the most suitable strategy in order to avoid an immune response. However, despite continual progress in the techniques of hiPSC reprogramming, culture and differentiation, this remains an expensive, long and arduous task. It appears increasingly feasible to use hiPSC banks to provide large-scale medical care (or semi-personalized hiPSC-based therapy)[85]. For successful organ or cell transplantation, three immunogenic challenges need to be overcome: compatibility for human leukocyte antigens (HLA), for blood groups in some cases such as liver transplantation, and for minor antigens. For blood group compatibility, the selection of group O donors could avoid part of the immune reaction. However, HLA is one of the most polymorphic loci in the mammalian genome, with thousands of alleles recognized. Various studies have estimated the number of pluripotent stem cell lines that would need to be stored in a national cell bank to cover HLA diversity. Using the United Kingdom population as an example, 150 selected donors could match about 85% of the country’s population with minimal immunosuppression (including 18.5% with a full match)[86]. Similar prediction models have been calculated with hiPSCs and, according to one, a bank comprising 100 hiPSC lines exhibiting the 20 most frequent HLA in each of the following populations would still exclude 22% of the European Americans, 37% of the Asians, 48% of the Hispanics, and 55% of the African Americans[87]; this highlights the fact that countries with more diverse populations would require a higher number of cell lines to be stored within the cell bank. Indeed, Fraga et al[88] characterized 22 human embryonic stem cell lines and observed that only 0.011% of the Brazilian population could be matched with these cell lines. It should be noted that the Brazilian population is known for its high degree of genetic diversity. However, HLA compatibility would not remove the need for immunosuppressive drugs, as the role of minor histocompatibility antigens in the rejection process should not be underestimated when transplanting cells from genetically unrelated individuals[89]. In particular cases, when the cells are transplanted into immune-privileged sites or for short periods of time, immune suppression may not be required. While generating such cell banks, which is likely to be feasible within an international consortium, parallel efforts should be carried out to screen the cells for genomic abnormalities prior to banking. Screening at the banking stage, however, neglects the impact of cell culture and differentiation on the cell genome; an additional quality control step should therefore be performed on the final cell product before its use in therapy. This is a supplementary control step and should not replace the baseline iPSC genomic integrity controls.

The next important question is to define which techniques or combination of techniques should be used to screen for genomic abnormalities. There is currently no defined consensus and standardization is lacking worldwide, although some guidelines are slowly emerging[90]. Karyotype analysis is likely to be a requirement and could also be used to rapidly eliminate aberrant iPSC lines. CNVs should also be analyzed, using CGH array or SNP array[91]. However, only SNP arrays enable the detection of consecutive LOH regions, which points to possible UPD. Genome or exome sequencing is highly informative, but specific care should be taken when using these techniques for CNV detection. In addition, Kang et al[92] highlighted the need to monitor mtDNA mutations in iPSCs, especially those generated from older patients, as well as the metabolic status of those iPSCs. Finally, once the data are collected, the level of tolerance applied should be defined. It should be highlighted that the qualification will always depend on the resolution of the technique used and on the analysis parameters, and that the stringency of the quality control required is directly linked to the functional impact of these mutations and to the application such cells will be used in.

The complexity of predicting the subsequent impact of genomic abnormalities

Assuming that the hiPSC lines available for banking will contain a few mutations, another challenge will be to assess whether cells carrying these mutations can be regarded as safe for use in the clinic. Once again, the answer is not straightforward. For example, it has been estimated that about 12% of the human genome contains CNVs. These CNVs account for 0.12% to 7.3% of genome variability in humans[93] and are often benign. More than 300000 CNVs which are not associated with a clinical phenotype have been identified in the general population and are catalogued in the Database of Genomic Variants[94]. Several genes can be found within the boundaries of large CNVs and the resulting functional changes are not easy to predict. Genomic variants associated with clinical symptoms are shared through the DECIPHER interface[95]. Other online tools, such as Variant Effect Predictor[96], can help predict the effects of variants. These programs are examples of online tools used to help interpret the genomic abnormalities present in hiPSC lines and to predict their possible effects. Despite the availability of such tools, each mutation should be considered individually, and the importance attributed to it will depend on the application the cells will be used for. Strong evidence of the potential impact of CNVs in the context of hiPSCs is provided by a recent article documenting the generation of several integration-free hiPSC lines from patients affected by two neurodevelopmental disorders directly caused by CNVs of the 7q11.23 locus[97]. The CNVs involve the loss or gain of approximately 28 genes, leading to Williams-Beuren Syndrome (OMIM 194050) in the case of deletion, or Williams-Beuren region duplication syndrome (OMIM 609757) in the case of duplication. Through hiPSC generation, the authors documented the CNV present in the patients’ fibroblasts and notably performed transcriptional analyses of patient-specific iPSCs at the pluripotent stage and once they were differentiated into neuronal cells, cardiac cells and gastrointestinal cells. Compared to the control hiPSC lines, the study first showed that several hundred genes are differentially expressed, highlighting a network effect of the 7q11.23 dosage imbalance. The authors also observed that several of the affected pathways were already dysregulated at the pluripotent state, and that various other expression changes were cell-type specific. This article confirms once more that hiPSCs can be good models to mimic pathologies and provides clear evidence of the functional impact of a pathologic CNV in hiPSCs and their derivatives. Importantly, because it shows that major abnormalities are already apparent in the undifferentiated state, this example suggests that previously undescribed mutations or CNVs may be innocuous when the undifferentiated or differentiated hiPSCs bearing them exhibit transcriptomes comparable to those of their normal counterparts.

The link between pluripotency and tumorigenicity

The gene expression networks responsible for the pluripotency of hPSCs are closely related to those implicated in oncogenesis[98], and the culture of hiPSCs could be associated with the positive and negative selection of genes involved in oncogenesis or cell cycle regulation. Both pluripotency and oncogenicity are linked to high proliferation capacity, self-renewal and, in some cases, differentiation capacity. Key factors of the pluripotency network have been shown to be involved in oncogenic processes. For example, the transcription factor NANOG plays a major role in the self-renewal of CD24+ cancer stem cells in hepatocellular carcinomas[99] and SOX2 promotes the survival of cancer stem cells in numerous malignant tumors, including lung and esophagus cancers[100]. The use of integrative vectors for reprogramming is particularly dangerous in this context. One study showed that cMyc reactivation after reprogramming led to the generation of tumors in chimeric mice[101]. In addition, a study described a long-term (47 d) follow-up after the transplantation of hiPSC-derived neurospheres in a spinal cord injury mouse model. The authors described the occurrence of tumors linked with the reactivation of the Oct3/4 transgene, associated with epithelial to mesenchymal transition based on transcriptomic analysis[102]. Another article highlights the direct and indirect interactions that exist between genomic integrity and pluripotency networks in human PSCs using transcriptomic and cistromic analyses[103]. Tumorigenicity is an intrinsic property of iPSCs, and advances in reprogramming, culture and differentiation protocols may not be sufficient to suppress this risk for clinical applications. In this regard, sorting-based strategies have been developed to purify the cell populations before transplantation. These strategies include cell selection using cytotoxic antibodies against PSCs[104], magnetic sorting to deplete PSCs[105] or enrichment of the differentiated cells[106]. The minimum number of human PSCs sufficient to generate teratomas after injection into immunodeficient mice has been investigated. Results suggest that 10000 cells are required if the injection is performed in the skeletal muscle and 100000 cells if injected into the myocardium[107]; this number is likely to differ if non-immunodeficient animals are used. In the case of the adult mouse liver, in which 800000 hepatocytes should be transplanted, 10000 cells represent only 1.25% of the entire population, underlining the importance of achieving a high differentiation yield and an effective cell sorting technique. In a Parkinson’s disease model in non-human primates, a study showed that the residual presence of ESCs in a preparation of neuronal cells differentiated from ESCs induced teratoma formation after cell injection into primate brains[108]. However, the injection of more mature, terminally differentiated cells circumvented this outcome. Depending on the target organ, the maturity state required of the differentiated cells has yet to be defined. However, satisfactory transplantation results have been obtained in a non-human primate model using ESC-derived cardiomyocyte progenitors[109], suggesting progress in this area.
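The 1.25% figure quoted above follows from a simple ratio, which the following minimal Python sketch reproduces. It takes the teratoma-forming cell numbers reported in the text for immunodeficient mice and the 800000-hepatocyte transplant size, and returns the maximum tolerable fraction of residual undifferentiated cells. Treating the skeletal muscle or myocardium threshold as applicable to the liver is the same simplifying assumption made in the text, not an established value, and the function name is hypothetical.

```python
def max_residual_fraction(teratoma_threshold: int, transplanted_cells: int) -> float:
    """Fraction of the graft that residual undifferentiated PSCs would
    represent if their number just reached the teratoma threshold.

    Assumption: the threshold measured at one injection site is applied
    to another organ, as done for illustration in the text.
    """
    if teratoma_threshold < 0 or transplanted_cells <= 0:
        raise ValueError("cell numbers must be non-negative and the graft non-empty")
    return teratoma_threshold / transplanted_cells


graft_size = 800_000  # hepatocytes transplanted into an adult mouse liver
for site, threshold in (("skeletal muscle", 10_000), ("myocardium", 100_000)):
    fraction = max_residual_fraction(threshold, graft_size)
    print(f"{site}: {fraction:.2%} of the graft")
```

With the skeletal muscle threshold, the purity of the differentiated population would therefore need to exceed 98.75% under these assumptions, which illustrates why both differentiation yield and cell sorting matter.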

What are the alternatives?

This review highlights that iPSCs are prone to genomic instability and are intrinsically linked to tumorigenicity. The next question to address is therefore whether we have access to another cell type, equivalent to iPSCs, which could help circumvent this problem. Recent findings demonstrating that hPSCs could be derived by nuclear transfer into human oocytes (called NT-ESCs for Nuclear Transfer-Embryonic Stem Cells)[110] offer an alternative method of generating pluripotent stem cells with a desired genotype (albeit without avoiding the ethical issues raised by the need for oocytes). In this context, the genetic and epigenetic integrity of these cells has been compared with that of isogenic iPSCs (generated from the human fibroblasts used for nuclear transfer). In addition to showing similar gene expression and DNA methylation profiles, NT-ESCs and hiPSCs have comparable numbers of de novo coding mutations, suggesting that regardless of the derivation approach, nuclear reprogramming itself is linked to genomic aberrations[111,112]. Another possibility to bypass this issue is to take advantage of the recent advances made in the field of trans-differentiation, also known as direct reprogramming. Human fibroblasts have been successfully transdifferentiated into hepatocytes[113], dopaminergic neurons[114] and cardiomyocytes[115]. However, it should be assessed whether these techniques are beneficial in terms of genomic integrity, as they avoid an important step of cell dedifferentiation. The epigenetic and genetic remodeling linked with transdifferentiation could be less detrimental than reprogramming followed by differentiation of the cells. Further investigation is required to fully understand the benefits of using novel strategies such as trans-differentiation to replace the use of iPSCs in cell therapy. However, it should be noted that transdifferentiation of somatic cells, which have a limited amplification capacity, will not solve the quantitative problem of cell availability.

CONCLUSION

Although hiPSCs are prone to genomic instability, reliable quality control combined with optimized reprogramming, culture and differentiation conditions may be sufficient to minimize the impact on the cell genome. Nevertheless, a footprint-free cell population derived from iPSCs seems difficult to obtain for now, and even the necessity of reaching such a goal is questionable depending on the planned application. Moreover, it is not obvious that hiPSCs contain more genomic variations than cultured somatic cells. A recent study derived fibroblast subclones and clonal iPSCs from the same population and showed by targeted deep sequencing that the clonal iPSCs and the fibroblast subclones displayed comparable numbers of de novo variants[21]. Thus, somatic mosaicism seems to be an important parameter, although it is underestimated because of the technical difficulty of detecting mosaicism below 5%-10%, whatever the technique used[116]. Finally, besides genetics, epigenetic factors have not been addressed in this review but may also play a role in the heterogeneity and behavior of hiPSCs, and are important parameters to address in the quest for an optimal iPSC-derived population.

ACKNOWLEDGEMENTS

The authors are grateful to all those who participated, closely or otherwise, in the invaluable discussions which led to the idea, writing, maturation and publication of this review.