Applications of single cell RNA sequencing to research of stem cells

2019-12-22 Xiao Zhang, Lei Liu

World Journal of Stem Cells, 2019, Issue 10

Xiao Zhang, Lei Liu

Xiao Zhang, Lei Liu, State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases and Department of Oral and Maxillofacial Surgery, West China Hospital of Stomatology, Sichuan University, Chengdu 610041, Sichuan Province, China

Abstract

Key words: Stem cells; Heterogeneity; Single cell RNA sequencing; Developmental trajectories; Cell subpopulations

INTRODUCTION

Stem cells (SCs) are immature cells that are not fully differentiated. Owing to their capacity for self-renewal and their pluripotent differentiation potential, SCs show great promise for widespread clinical applications, particularly for refractory diseases such as stroke, Parkinsonism, myocardial infarction, and diabetes. Furthermore, as seed cells in tissue engineering, SCs have been applied widely to tissue and organ regeneration. Although SCs have been studied for decades, many issues remain unresolved. One of the most striking is that even when the SCs within a single tissue appear homogeneous, they contain diverse subpopulations of cells whose functions, morphologies, developmental statuses, or gene expression profiles differ from those of the other subpopulations[1-3]. Previous studies have indicated that this heterogeneity of cellular states arises from cell physiology, differentiation state[4,5], and inherent plasticity[6-8], and that it hinders further study of the biological characteristics and applications of SCs[9]. Although bulk approaches using microarrays or high throughput RNA sequencing (RNA-seq) provide important insights into SCs, they are limited because their results reflect average measurements across large populations of cells or are dominated by the most abundant cells[10,11]. They therefore overlook the unique biological behaviors of individual cells and conceal cell-to-cell variation. As a consequence, heterogeneity remains a major issue to be resolved in the research and applications of SCs.

Studies conducted at the single cell level are imperative for understanding the heterogeneity of SCs. Low throughput single cell analysis techniques, such as single cell quantitative PCR and single cell real-time quantitative PCR, have been used to test certain molecular markers in single cells, but they are limited to studying a small number of genes.

Building on recent advances in single cell technologies and the next-generation sequencing (NGS) approach, a new technique, single cell RNA sequencing (scRNA-seq), provides an effective means of resolving the above-mentioned issues[12]. scRNA-seq involves whole transcriptome amplification at the single cell level, which comprises reverse transcription of mRNA into cDNA followed by cDNA amplification, and then high throughput sequencing. Compared with traditional sequencing techniques, scRNA-seq can efficiently describe the heterogeneity of cell subpopulations, measure cell-to-cell variability in gene expression, identify previously unreported cell types and define associated cell markers[13,14], and describe developmental trajectories[1,15,16]. scRNA-seq has attracted much attention since it was first reported in 2009, and its applications to SC research continue to grow, particularly for the study of heterogeneity and cell subpopulations in early embryonic development[17,18]. Here, we discuss the scRNA-seq technique, its applications to research of SCs, and future perspectives.

SINGLE CELL RNA SEQUENCING

Single cell isolation methods, whole transcriptome amplification at the single cell level, and high throughput RNA-seq have led to the development of modern scRNA-seq platforms. The current scRNA-seq workflow is organized into a set of steps: single cell isolation, reverse transcription of mRNA, cDNA amplification and sequencing library construction, high throughput sequencing, and computational analysis[19].

Isolating target single cells from a block of tissue or from cultured cells is a critical step in scRNA-seq because it is what allows research at the single cell level. There are many approaches to isolating single cells, such as microfluidic systems, fluorescence-activated cell sorting, micromanipulation, and laser capture microdissection[20]. Microfluidic systems isolate and capture single cells in micron-scale channels. Compared with other isolation methods, they offer many advantages: they allow cells to be isolated from a small population, provide high throughput, reduce reagent costs and contamination, and improve accuracy. They can also be used to isolate rare cells. Microfluidic isolation is therefore considered an excellent method that provides a robust foundation for single cell sequencing-based analysis[21,22].

Whole transcriptome amplification at the single cell level is critical for scRNA-seq and plays a significant role in producing adequate cDNA to construct the sequencing library. PCR-based methods, including degenerate oligonucleotide primed PCR, primer extension preamplification PCR, and ligation-mediated PCR, are common methods of cDNA amplification. More recently, multiple displacement amplification and multiple annealing and looping-based amplification cycles, which are advanced techniques in the field of single cell amplification, have been found to produce a higher cDNA yield, higher fidelity, and lower amplification bias than PCR-based whole transcriptome amplification methods.

The advancement of NGS technologies has facilitated single cell sequencing, enabling millions of DNA fragments to be sequenced simultaneously and allowing thorough analyses of genomes and transcriptomes. At present, prepared libraries are sequenced using high throughput RNA-seq platforms including Fluidigm C1, Drop-seq, Chromium 10X, and SCI-Seq[23]. In terms of sequencing depth, recent single cell transcriptomics studies have sequenced 0.1-5 million reads per cell. To achieve saturated gene detection, 1 million reads per cell is generally recommended[19]. The rapid development of the three main technologies mentioned above has increased the accuracy of scRNA-seq, extended its applications, and made it a focus of biomedical research.

After high throughput sequencing, a comprehensive and systematic computational analysis is performed. In current single cell studies that decode heterogeneity in SC populations, this step includes read quantification, quality control, dimension reduction and visualization of data, unsupervised clustering analysis, and differential expression analysis to interpret the acquired data sets. Researchers who perform the computational analysis need some knowledge of programming languages so that they can work with pre-established algorithms for aligning, clustering, and visualizing the data. Specialized tools developed by bioinformatics laboratories, such as DESeq2, MAST, and the easy-to-use package Seurat, are generally used for unsupervised clustering analysis and differential expression analysis.
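The quality control, normalization, and dimension reduction steps described above can be illustrated with a minimal sketch. It uses only NumPy and synthetic counts; the function names, the `min_genes` threshold, and the `target_sum` scaling factor are assumptions chosen for illustration and are not taken from any of the cited tools or studies.

```python
import numpy as np

def preprocess_counts(counts, min_genes=3, target_sum=1e4):
    """Filter low-quality cells, then library-size normalize and log-transform.

    counts: (n_cells, n_genes) array of non-negative read or UMI counts.
    min_genes and target_sum are illustrative values, not recommendations.
    """
    counts = np.asarray(counts, dtype=float)
    genes_detected = (counts > 0).sum(axis=1)
    keep = genes_detected >= min_genes            # quality control
    kept = counts[keep]
    lib_size = kept.sum(axis=1, keepdims=True)    # total counts per cell
    normalized = kept / lib_size * target_sum
    return np.log1p(normalized), keep

def pca(matrix, n_components=2):
    """Project cells onto the top principal components using SVD."""
    centered = matrix - matrix.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

rng = np.random.default_rng(0)
# Synthetic data: two groups of 30 cells x 50 genes plus 2 empty captures
group_a = rng.poisson(5.0, size=(30, 50))
group_b = rng.poisson(5.0, size=(30, 50))
group_b[:, :10] += rng.poisson(20.0, size=(30, 10))  # genes raised in group B
empty = np.zeros((2, 50), dtype=int)                  # failed captures
counts = np.vstack([group_a, group_b, empty])

log_expr, keep = preprocess_counts(counts)
embedding = pca(log_expr, n_components=2)
print(log_expr.shape, embedding.shape)
```

In practice, packages such as Seurat wrap these steps with more careful statistics; the sketch only shows the order of operations.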

APPLICATIONS OF SINGLE CELL RNA SEQUENCING TO RESEARCH OF STEM CELLS

In the research of SCs, scRNA-seq is mainly used to identify cell subpopulations, analyze rare cell types, and describe developmental trajectories and regulatory networks.

Identification of cell subpopulations

One major application of scRNA-seq is the study of stem cell heterogeneity. After unbiased sampling of SCs from a tissue and generation of a transcriptome for each cell, the cells are clustered based on their expression data. Established clustering and dimension reduction methods, such as hierarchical clustering analysis, K-means, and principal component analysis, are usually applied to group cells into subpopulations. The principle is that cells are grouped according to the expression levels of their genes, as quantified by unique molecular identifiers. Cluster information is then overlaid on cells in two- or three-dimensional t-distributed stochastic neighbor embedding plots, which are used to visualize cell subpopulations. Statistical analysis that identifies genes significantly differentially expressed between subpopulations defines cell markers, which help to discriminate the clusters and to purify and distinguish subpopulations of interest. To functionally characterize the clustered subpopulations, functional annotation of the differentially expressed genes is an indispensable step in the analysis of transcriptome data.
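As a rough illustration of how clustering and marker identification fit together, the sketch below runs a plain K-means on synthetic log-expression values and ranks genes by their mean difference between one cluster and the rest. The deterministic farthest-point start and the mean-difference ranking are simplifications assumed here for clarity; tools such as DESeq2, MAST, and Seurat apply formal statistical tests instead.

```python
import numpy as np

def kmeans(data, k, n_iter=100):
    """Plain Lloyd's K-means with a deterministic farthest-point start."""
    centers = [data[0]]
    for _ in range(1, k):
        # next start: the cell farthest from all centers chosen so far
        d = np.min([((data - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(data[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels

def rank_markers(data, labels, cluster, top_n=5):
    """Rank genes by mean expression in `cluster` minus the mean elsewhere."""
    in_cluster = labels == cluster
    diff = data[in_cluster].mean(axis=0) - data[~in_cluster].mean(axis=0)
    return np.argsort(diff)[::-1][:top_n]

rng = np.random.default_rng(1)
# Synthetic log-expression: 40 cells x 20 genes, second half over-expresses genes 0-4
expr = rng.normal(0.0, 0.5, size=(40, 20))
expr[20:, :5] += 4.0

labels = kmeans(expr, k=2)
markers = rank_markers(expr, labels, cluster=labels[20])
print("cluster sizes:", np.bincount(labels))
print("candidate marker genes:", sorted(markers.tolist()))
```

The genes returned by `rank_markers` are only candidates; a real analysis would follow with a significance test and functional annotation.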

As an effective tool for research on heterogeneity of SCs, scRNA-seq is commonly applied to cancer SCs, adult SCs, and induced pluripotent SCs.

Cancer is regarded as one of the most complex and heterogeneous diseases, and cancer SCs are a major source of the formation, metastasis, and drug resistance of tumors. Intratumoral heterogeneity indicates a diverse pathological potential among cancer SCs, which increases the difficulty of targeted cancer therapy. Therefore, the heterogeneity of cancer SCs urgently needs to be addressed in cancer research, diagnosis, and treatment. scRNA-seq can detect genetic information and differences in the expression and control of genes among individual cells, making it possible to understand intratumoral heterogeneity, map different clones in tumors, and analyze cancer SCs, all of which is informative for cancer research. Recent studies employing scRNA-seq for cancer research have investigated breast cancer[24,25], lung cancer[26,27], renal cell cancer[28], glioblastoma[29,30], and hepatocellular carcinoma[31].

Adult SCs, which reside in almost all tissues of the body, have a self-renewal capacity and multi-lineage differentiation potential under certain conditions. They are presently a research focus in the stem cell field. Among all types of adult SCs, adipose-derived mesenchymal stromal/stem cells (ADSCs) have received increasing interest for immune and hematopoietic modulation, anti-inflammatory effects, proangiogenic properties, and tissue repair and restoration, owing to their relative ease of harvest, abundance, and multi-lineage differentiation[32,33]. Numerous studies have demonstrated that ADSCs are heterogeneous populations consisting of various cell subtypes[34]. Accurately delineating subpopulations by functional properties or surface marker expression is necessary to promote their further translation to clinical benefits. Schwalie et al[35] revealed three distinct subpopulations of ADSCs and adipose precursor cells in subcutaneous adipose tissue using scRNA-seq. They demonstrated that one of these subpopulations, CD142+ABCG1+ cells, suppresses adipocyte formation in vivo and in vitro in a paracrine manner. Furthermore, they showed that the mechanism of this action possibly involves the Spink2, Rtp3, Vit, and/or Fgf12 genes. These findings suggested a potentially critical role for CD142+ABCG1+ cells in modulating the plasticity and metabolic signature of distinct adipose cell-containing systems. Other studies of adult SC heterogeneity using scRNA-seq have investigated hematopoietic SCs[36] and neural SCs[37,38].

Induced pluripotent SCs are capable of unlimited self-renewal and can give rise to specialized cell types through stepwise changes in their transcriptional networks. Research has indicated that gene expression is highly heterogeneous among induced pluripotent SCs, but the heterogeneity of cell states had not been described at a global transcriptional level. Nguyen et al[39] used scRNA-seq to study, at the transcriptional level, the heterogeneous states of human induced pluripotent SCs present in pluripotent cultures. Four independent subpopulations of cells were identified and defined. Next, cell trajectories of transition between pluripotency states were defined. Their study provided the largest dataset of single cell transcriptional profiling of undifferentiated human induced pluripotent SCs, which increased our understanding of the complexity of pluripotent SCs.

Analysis of rare cell types

The second area that benefits immensely from scRNA-seq is the analysis of rare cell types. Commonly used approaches, such as microarrays and the NGS approach of high throughput RNA sequencing, are limited to large populations of cells. When cells, each of which may have a distinct function and role, are available in only trace quantities, the transcriptome can hardly be profiled by these techniques. scRNA-seq can be used to characterize hidden subpopulations of rare cell types and measure gene expression in individual cells, overcoming the sample size limitation of traditional transcriptome analysis. Although a limited number of cells can influence the results, it has been demonstrated that 30 cells is the minimum sample size needed to sufficiently analyze the complexity of large cell subpopulations[40]. In the early human embryo, only a very small number of embryonic cells and embryonic SCs can be isolated, which makes it difficult to study the gene regulatory network controlling human embryonic development by traditional methods. This problem has been addressed by the development of scRNA-seq. Yan et al[41] analyzed 124 individual cells from human preimplantation embryos and human embryonic SCs at various passages using scRNA-seq. The number of maternally expressed genes detected was 22687, significantly more than the 9735 maternal genes detected previously by cDNA microarray. The results provided a comprehensive framework of the transcriptome landscapes of human early embryos and embryonic SCs. Additionally, scRNA-seq is used in the research of trace quantities of cancer SCs.
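How many cells must be sequenced to observe a rare subpopulation at all can be estimated with a simple binomial model. The sketch below is only an illustration under the assumption that cells are sampled independently with a fixed subpopulation frequency; it is not the analysis behind the 30-cell figure cited above, and the function names are hypothetical.

```python
from math import comb

def prob_at_least(n_cells, frequency, min_hits=1):
    """Probability of seeing at least min_hits cells of a subpopulation
    among n_cells, assuming independent sampling at a fixed frequency."""
    p_fewer = sum(
        comb(n_cells, i) * frequency**i * (1 - frequency)**(n_cells - i)
        for i in range(min_hits)
    )
    return 1.0 - p_fewer

def cells_needed(frequency, min_hits=1, confidence=0.95, max_cells=1_000_000):
    """Smallest number of sampled cells that reaches the requested confidence."""
    for n in range(min_hits, max_cells + 1):
        if prob_at_least(n, frequency, min_hits) >= confidence:
            return n
    raise ValueError("confidence not reached within max_cells")

# Cells to sample for a 95% chance of capturing at least one cell
# from subpopulations making up 10% and 1% of the tissue
for freq in (0.1, 0.01):
    print(freq, cells_needed(freq))
```

The required number grows roughly in inverse proportion to the subpopulation frequency, which is why high throughput isolation matters for rare cell types.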

Description of developmental trajectories and regulatory networks

Another important application of scRNA-seq is the description of developmental trajectories and the identification of gene regulatory networks. Mapping the pathway of differentiation and elucidating the underlying molecular controls are major goals in the development of stem cell technologies. scRNA-seq can be used to study the molecular dynamics of various cell types during development, map developmental trajectories, and reveal cell fate changes. During these processes, proliferative progenitor cells and stationary cells are detected, cell states that exist only transiently or during discrete time windows are identified, dynamic changes in the gene expression of different cell lineages are recorded, and visualization of developmental trajectories is ultimately achieved. The application of scRNA-seq to the description of developmental trajectories and regulatory networks has been reported in many studies. Hematopoietic SCs, which branch into all blood cell lineages of erythrocytes, leukocytes, and lymphocytes, must follow a highly controlled route. The molecular networks that control stem cell fate decisions, such as cell division or quiescence and differentiation or self-renewal, are still unclear, and the chronological developmental trajectories of single hematopoietic cells from SCs to mature cells had not been described. Bendall et al[9] provided a comprehensive analysis of human B lymphopoiesis and constructed developmental trajectories from hematopoietic SCs through to naive B cells using scRNA-seq expression data, laying the foundation for applying this approach to other tissues. Muscle SCs activate, divide, and give rise to muscle progenitors when injuries occur. scRNA-seq has been applied to capture the transcriptional state of individual muscle SCs and primary myoblasts. Dell'Orso et al[42] reported the homeostatic and developmental dynamic trajectories of regenerative adult muscle SCs and primary myoblasts, and described the transcriptional changes related to metabolic pathways. In addition, other studies of developmental differentiation and gene regulation networks of SCs using scRNA-seq have focused on human pluripotent stem cell differentiation pathways[39], molecular trajectories of the early progenitors during human cord blood hematopoiesis[43], and developmental dynamics of adult hippocampal quiescent neural SCs[44].
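The idea of ordering cells along a developmental trajectory can be sketched in its simplest form by ranking cells on their first principal component. This linear ordering is an assumption made here for illustration and is not the method used in the studies cited above, which rely on dedicated trajectory algorithms; the data are synthetic and the function name is hypothetical.

```python
import numpy as np

def pseudotime_from_pc1(expr):
    """Scale each cell's first-principal-component score to [0, 1].

    The direction is arbitrary: PCA alone cannot tell which end of the
    ordering is the start of differentiation.
    """
    centered = expr - expr.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    pc1 = u[:, 0] * s[0]
    return (pc1 - pc1.min()) / (pc1.max() - pc1.min())

rng = np.random.default_rng(2)
# Hidden developmental time for 100 synthetic cells
true_time = rng.uniform(0.0, 1.0, size=100)
# 10 genes rise with time, 10 fall, 10 stay flat; Gaussian noise is added
slopes = np.concatenate([np.full(10, 3.0), np.full(10, -3.0), np.zeros(10)])
expr = true_time[:, None] * slopes[None, :] + rng.normal(0.0, 0.3, size=(100, 30))

pseudotime = pseudotime_from_pc1(expr)
corr = np.corrcoef(true_time, pseudotime)[0, 1]
print(f"|correlation with hidden time| = {abs(corr):.2f}")
```

A single linear axis cannot represent branching lineages, so real trajectory methods build graphs or trees over the cells instead.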

In addition to the above applications to research of SCs, scRNA-seq can be used in the identification of cellular states such as the stage or speed of the cell cycle.

PERSPECTIVES

Recent progress in the development of scRNA-seq has been rapid and exciting. scRNA-seq has enabled us to explore molecular profiles at the single cell level, which allows characterization of cellular heterogeneity and development. With this rapid development, many challenges have been encountered in the analysis, integration, and interpretation of single cell data. The limited efficiency of RNA capture and the bias of cDNA amplification may distort gene expression profiles and artificially magnify cell-to-cell variability[3]. Numerous approaches have been developed to address these issues, such as optimization of protocols and improvement of computational and statistical methods. In summary, the past few years have witnessed remarkable growth of this technique, a trend we believe will continue, enabling deeper understanding of the biological complexity of SCs and related diseases.
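The effect of limited capture efficiency on apparent variability can be shown with a small simulation. It assumes, purely for illustration, that every cell contains exactly the same number of transcripts of a gene and that each transcript is captured independently with a fixed probability; the function name and parameter values are hypothetical.

```python
import numpy as np

def observed_cv(true_count, capture_efficiency, n_cells, rng):
    """Coefficient of variation of observed counts when every cell holds
    exactly true_count transcripts and each is captured independently."""
    observed = rng.binomial(true_count, capture_efficiency, size=n_cells)
    return observed.std() / observed.mean()

rng = np.random.default_rng(3)
results = {}
for efficiency in (0.9, 0.3, 0.05):
    results[efficiency] = observed_cv(
        true_count=100, capture_efficiency=efficiency, n_cells=5000, rng=rng
    )
    # Theoretical CV of a binomial: sqrt((1 - p) / (n * p))
    expected = np.sqrt((1 - efficiency) / (100 * efficiency))
    print(f"efficiency={efficiency:.2f}  "
          f"simulated CV={results[efficiency]:.3f}  binomial CV={expected:.3f}")
```

Although the simulated cells are identical, the observed variability rises as capture efficiency falls, which is the technical noise that protocol optimization and statistical modeling aim to separate from true biological heterogeneity.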