
Low complexity domains, condensates, and stem cell pluripotency

2021-07-02 Munender Vodnala, Eun Bee Choi, Yick Fong

World Journal of Stem Cells, 2021, Issue 5

Munender Vodnala, Eun-Bee Choi, Yick W Fong

Munender Vodnala, Eun-Bee Choi, Yick W Fong, Department of Medicine, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, United States

Yick W Fong, Harvard Stem Cell Institute, Cambridge, MA 02138, United States

Abstract Biological reactions require self-assembly of factors in the complex cellular milieu. Recent evidence indicates that intrinsically disordered, low-complexity sequence domains (LCDs) found in regulatory factors mediate diverse cellular processes from gene expression to DNA repair to signal transduction, by enriching specific biomolecules in membraneless compartments or hubs that may undergo liquid-liquid phase separation (LLPS). In this review, we discuss how embryonic stem cells take advantage of LCD-driven interactions to promote cell-specific transcription, DNA damage response, and DNA repair. We propose that LCD-mediated interactions play key roles in stem cell maintenance and safeguarding genome integrity.

Key Words: Liquid-liquid phase separation; Embryonic stem cell; Pluripotency; Low complexity domain; Transcription; DNA damage response

INTRODUCTION

Embryonic stem cells (ESCs) are derived from pluripotent cells in the inner cell mass of the blastocyst[1,2]. ESCs are highly proliferative cells that can self-renew indefinitely in vitro. In addition to replication stress due to an abbreviated cell cycle[3], it has been shown that ESCs are transcriptionally hyperactive[4]. The increased replication and transcriptional burdens in ESCs promote genome instability[5-8]. Therefore, ESCs are under increased pressure to conduct transcription and DNA repair efficiently to maintain stem cell identity and genome integrity. Deciphering the mechanisms by which ESCs safeguard transcriptional and genomic fidelity is important for understanding pluripotency, and for translating stem cell-based therapies.

In response to developmental signals, ESCs exit from self-renewal and undergo differentiation to generate every cell type in the body. This highly dynamic process requires coordinated changes in gene expression patterns. Genes required for stem cell self-renewal are silenced, while genes encoding developmental regulators that are normally repressed are reactivated to direct the differentiation of ESCs into cell types representing the three embryonic germ layers[9]. Wholesale changes in gene expression are accompanied by reconfiguration of chromatin structure in differentiating ESCs, whereby previously euchromatic regions associated with pluripotency genes are packaged into repressive heterochromatin[10,11]. Conversely, genomic loci associated with lineage-specific genes become euchromatic, thus permissive to transcriptional activation[12].

A fundamental problem in stem cell biology (and cell biology in general) is how complex biochemical reactions (e.g., transcription, DNA replication, repair, chromatin remodeling, and signal transduction) are organized and regulated inside a densely packed cellular space. While specific cellular reactions can be compartmentalized within classic membrane-enclosed organelles such as the endoplasmic reticulum and Golgi apparatus, those that occur inside the nucleus present a unique challenge: the nucleus lacks such organelles to spatially and temporally control biological reactions, and inadvertent “mixing” of these reactions could prove fatal to a cell. Indeed, it has been shown that proteins in the nucleus are often enriched in discrete membraneless compartments. For example, factors involved in mRNA splicing are concentrated in the Cajal bodies to facilitate assembly of spliceosomal machinery[13]. Nucleoli are sites of ribosome biogenesis enriched in factors required for ribosomal RNA transcription and processing[14], and were recently identified as a protein quality control compartment[15]. Under specific conditions of biomolecular concentration, temperature, pH, and salt concentration, biomolecules can coalesce and separate from bulk solution in cells as condensates reminiscent of oil droplets in water[16-18]. This process, termed liquid-liquid phase separation (LLPS), underlies the formation of membraneless compartments such as the nucleolus and Cajal body[19,20]. Recent work also implicates biomolecular condensates in a wide range of cellular processes, enriching specific macromolecules within distinct compartments and increasing local concentration to overcome activation barriers[13,14,21-23].

In this review, we examine the emerging roles of protein hub formation and condensation in compartmentalizing and coordinating biochemical reactions in the complex nuclear environment. We discuss how protein condensates enhance cellular reactions critical for stem cell function, how they facilitate crosstalk between cellular processes to generate complex responses to a changing cellular environment, and how these responses collectively safeguard stem cell fidelity.

PHASE SEPARATION OF PROTEINS CONTAINING INTRINSICALLY DISORDERED REGIONS

Intrinsically disordered regions (IDRs) are prevalent in the eukaryotic proteome, particularly among regulatory proteins such as transcription factors[24]. These unstructured regions are often composed of low-complexity sequences limited in amino acid diversity. Low complexity domains (LCDs) are enriched in glycine, and polar residues such as serine, asparagine, glutamine, and tyrosine. Other IDRs are characterized by clusters of positively and negatively charged amino acids (e.g., lysine, glutamic acid) interspersed with hydrophobic residues such as phenylalanine[25]. These unique amino acid compositions found in LCDs have been shown to promote LLPS by polar or charge-charge intermolecular interactions in a concentration-dependent manner[25,26]. In addition, the flexible nature of LCDs is thought to facilitate their interaction with multiple protein partners, by rapidly adopting an ensemble of conformations[27,28]. Indeed, the ability of LCDs to bind multiple proteins, also known as multivalency, is a major driving force of LLPS by lowering the threshold concentration[29]. It is worth emphasizing that while LCDs are unstructured sequences, they do not always bind promiscuously to any protein; instead, they can be selective for binding partners[30-32]. More importantly, because these selective multivalent interactions are usually weak and transient, as opposed to the high affinity (but low valency) “lock-and-key” interactions found in ligand-receptor complexes, they allow dynamic regulation of LLPS properties, condensate composition, and biochemical reactions that take place inside these bodies. In the following sections, we discuss examples wherein LCD-driven interactions play a critical role in regulating cellular processes relevant to ESC biology.
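The notion of a sequence "limited in amino acid diversity" can be made concrete with a simple compositional measure. The sketch below is only an illustration, not a method used in the studies cited here: it scores a sequence by the Shannon entropy of its residue composition and reports the fraction made up of chosen residues. The two toy sequences and the function names are hypothetical and do not correspond to any real protein.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits per residue) of the amino acid composition of seq."""
    counts = Counter(seq)
    total = len(seq)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def residue_fraction(seq: str, residues: str) -> float:
    """Fraction of positions in seq occupied by any of the given residues."""
    wanted = set(residues)
    return sum(1 for aa in seq if aa in wanted) / len(seq)


# Hypothetical toy sequences for illustration only (not real proteins).
LOW_COMPLEXITY_TOY = "SGSGSYSGSQSGSYGSGSNS"  # dominated by serine and glycine
DIVERSE_TOY = "MKVLAEDRTHCWPFIGNQSY"         # each of the 20 residues once

if __name__ == "__main__":
    for name, seq in [("low-complexity", LOW_COMPLEXITY_TOY), ("diverse", DIVERSE_TOY)]:
        print(name, round(shannon_entropy(seq), 2), residue_fraction(seq, "SG"))
```

A sequence using all 20 amino acids equally reaches the maximum of log2(20) bits per residue, whereas a sequence dominated by a few residues scores much lower. Dedicated tools apply more refined, windowed versions of this idea; the whole-sequence score here is a deliberate simplification.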

LCDS IN TRANSCRIPTIONAL ACTIVATION AND REPRESSION IN ESCS

Transcriptional activation

During early embryonic development, pluripotent cells in the inner cell mass of the blastocyst rapidly expand through self-renewal[33]. Buttressing this critical developmental period is a robust gene regulatory network that functions to maintain pluripotency in these cells[34-36]. High transcriptional activity in ESCs has been shown to skew towards genes that encode transcription factors and chromatin remodeling machinery[4], likely as an adaptive measure to meet the increased transcriptional demand. How expression of these factors stabilizes the pluripotent state in ESCs has become apparent through a number of seminal studies. Transcription factors octamer-binding transcription factor 4 (OCT4) and sex-determining region Y-box 2 (SOX2) play a pivotal role in activating stem cell pluripotency[37-42]. Cooperative binding of OCT4 and SOX2 along with a wide array of transcription factors and transcriptional coactivators at gene enhancers leads to the formation of “super enhancers.” Super enhancers differ from typical enhancers by their unusually high density of transcription factors spread over a relatively large genomic region measured in kilobases[43-45]. These transcription factor-rich domains are thought to fuel higher transcriptional output by the RNA polymerase II (Pol II) machinery. The cooperative nature of transcription factor assembly at super enhancers is thought to allow the formation or collapse of super enhancers over a relatively small concentration range of transcription factors[44], and is therefore proposed to play an important role in dynamic gene expression during ESC self-renewal and differentiation. Recent studies on LCDs, which are highly enriched in transcription factors, provide important insights into how these high-density transcription factor hubs are formed to drive cell-specific transcription in ESCs (Table 1)[46-54].
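Why cooperativity narrows the concentration range over which an assembly forms or collapses can be illustrated with the textbook Hill equation. This is a generic sketch of cooperative binding, not the model used in the cited super enhancer work; the function names and the Hill coefficients chosen below are assumptions for illustration.

```python
def hill_occupancy(conc: float, k_half: float, n: float) -> float:
    """Fractional occupancy under the Hill equation.

    conc   : factor concentration (arbitrary units)
    k_half : concentration giving half-maximal occupancy (same units)
    n      : Hill coefficient; n = 1 means no cooperativity
    """
    return conc**n / (k_half**n + conc**n)


def switch_width(n: float) -> float:
    """Fold-change in concentration needed to go from 10% to 90% occupancy.

    Solving the Hill equation for 10% and 90% occupancy and taking the ratio
    of the two concentrations gives 81 ** (1 / n), independent of k_half.
    """
    return 81 ** (1 / n)


if __name__ == "__main__":
    # Hypothetical Hill coefficients chosen only to show the trend.
    for n in (1, 2, 4, 8):
        print(f"n = {n}: {switch_width(n):.2f}-fold change spans 10% to 90% occupancy")
```

In this simplified picture a non-cooperative site (n = 1) needs an 81-fold change in concentration to switch from mostly empty to mostly occupied, whereas a strongly cooperative assembly switches over a few-fold change, which is the qualitative behavior described for super enhancers.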

Table 1 Low complexity domain-containing proteins in transcriptional activation and repression in embryonic stem cells

Mediator

The ubiquitous transcriptional coactivator Mediator is a large, multisubunit complex that is required for transcription of most Pol II genes, by virtue of its ability to interact with a wide array of transcription factors and Pol II[55,56]. Mediator stimulates transcription by functionally and physically connecting transcription factors at enhancers to the Pol II machinery at promoters[57], where distal enhancers are brought into proximity with their target promoters through DNA looping by cohesin-CTCF (CCCTC-binding factor)[58,59] (Figure 1A). Small hairpin RNA-mediated screens indicated that downregulation of subunits of the Mediator complex compromises expression of OCT4/SOX2-dependent genes in mouse ESCs[60]. Consistent with its role as a coactivator for OCT4/SOX2, Mediator colocalizes extensively with OCT4 and SOX2 across the ESC genome[61].

The mediator complex subunit 1 (MED1) of the Mediator complex contains an LCD at the C-terminus that is rich in serine residues[62]. Studies have shown that the MED1 LCD and the Mediator holocomplex undergo LLPS in vitro. Substitution of serine residues in MED1 with alanine abolishes phase separation, indicating the importance of serine-mediated polar intermolecular interactions in LLPS. In vitro droplet assays designed to examine how Mediator interacts with OCT4 and SOX2 showed that MED1 LLPS droplets readily incorporate OCT4 and SOX2[46]. Furthermore, mutations of acidic amino acids in the activation domain of OCT4, which abrogate transactivation activity, also compromise its ability to phase separate with MED1. These observations indicate a functional correlation between MED1-OCT4 LLPS and transcriptional activation, and suggest LCD-dependent phase separation as a potential mechanism by which activator-coactivator complexes are assembled at gene enhancers (Figure 1B). It is worth noting that diverse transcription factors (e.g., p53, myelocytomatosis viral oncogene homolog, NANOG, estrogen receptor) can also phase separate with MED1 in vitro[46]. These results demonstrate that the LCD of MED1 is rather promiscuous in binding, consistent with Mediator acting as a ubiquitous coactivator.

RNA Pol II: Carboxy-terminal domain

Biochemical studies demonstrated that Mediator interacts with RNA Pol II through the carboxy-terminal domain (CTD) of the largest subunit of the Pol II complex[63,64]. Mammalian CTD contains 52 heptad repeats of the consensus sequence Y1S2P3T4S5P6S7. This LCD plays important roles at all steps of transcription from initiation to elongation to termination[65]. Initiation requires the assembly of the preinitiation complex (PIC), composed of general transcription factors (GTFs), Mediator, and Pol II with unphosphorylated CTD, at gene promoters[66,67]. As Pol II leaves the promoter and initiates transcription, the CTD becomes phosphorylated on serine 5 (Ser5) by the kinase associated with the GTF TFIIH, cyclin dependent kinase 7 (CDK7)[68-71]. It is known that Mediator interacts preferentially with the unphosphorylated CTD[72,73] (Figure 1A). This suggests that phosphorylation of the Pol II CTD may disrupt its interaction with Mediator, thus providing a mechanism by which Pol II can dissociate from the PIC to initiate transcription. Two recent studies support this notion and implicate LLPS in regulating Mediator-Pol II interaction[32,52]. They demonstrated that the ability of the CTD to undergo LLPS by itself, or with MED1, is disrupted by phosphorylation of Ser5 by CDK7. Therefore, Mediator-Pol II interaction and promoter-enhancer communication can be modulated by the phosphorylation status of the CTD during the transcription cycle.
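The repeat structure of the CTD, and why it offers so many phosphorylation sites, can be sketched in a few lines. The example below builds an idealized CTD from 52 copies of the consensus heptad and lists the positions of the serines at a chosen heptad position. It assumes every repeat matches the consensus exactly, which is a simplification (some repeats in the native mammalian CTD deviate from it); the function names are hypothetical.

```python
HEPTAD = "YSPTSPS"  # consensus heptad Y1-S2-P3-T4-S5-P6-S7
N_REPEATS = 52      # number of heptad repeats in the mammalian CTD


def consensus_ctd(n_repeats: int = N_REPEATS) -> str:
    """Idealized CTD made of exact consensus repeats (a simplifying assumption)."""
    return HEPTAD * n_repeats


def serine_sites(ctd: str, heptad_position: int) -> list[int]:
    """1-based residue indices of the serine at a given heptad position (2, 5, or 7)."""
    if not 1 <= heptad_position <= len(HEPTAD) or HEPTAD[heptad_position - 1] != "S":
        raise ValueError("heptad_position must be 2, 5, or 7 (the serine positions)")
    start = heptad_position - 1
    return [i + 1 for i in range(start, len(ctd), len(HEPTAD)) if ctd[i] == "S"]


if __name__ == "__main__":
    ctd = consensus_ctd()
    print("length:", len(ctd))
    print("Ser5 sites (CDK7 targets):", len(serine_sites(ctd, 5)))
    print("Ser2 sites (CDK9 targets):", len(serine_sites(ctd, 2)))
```

Under this idealization the CTD presents one Ser2 and one Ser5 in each of its 52 repeats, which illustrates how a single low-complexity tail can act as a highly multivalent, phosphorylation-tunable interaction platform.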

Positive transcription elongation factor b

After Pol II escapes the promoter, the CTD becomes hyperphosphorylated at Ser2 by CDK9 of the positive transcription elongation factor b (P-TEFb), while the Ser5 phosphorylation previously deposited by TFIIH is gradually removed by phosphatases[74-76]. This switch in Ser phosphorylation pattern is thought to promote elongation by aiding the recruitment of elongation and chromatin-modifying factors to the transcribing Pol II[75] (Figure 1A). A recent study indicated that the histidine-rich LCD of the cyclin T1 subunit of P-TEFb (a heterodimer of CDK9 and cyclin T1) stabilizes the binding of P-TEFb to active genes and to the Pol II CTD to catalyze Ser2 hyperphosphorylation[77]. The authors showed that cyclin T1 forms liquid-like puncta in the nucleus in an LCD-dependent manner. Formation of these nuclear condensates and the ability of P-TEFb to hyperphosphorylate the CTD are sensitive to 1,6-hexanediol, which disrupts LLPS. Consistent with these observations, the LCD is also required for phase separation of cyclin T1 with the CTD in vitro. Interestingly, pre-phosphorylation of the CTD by CDK7/TFIIH significantly enhances cyclin T1-CTD LLPS, suggesting that a potential function of Ser5 phosphorylation by TFIIH after promoter escape is to prime LLPS of P-TEFb with the CTD, thereby increasing the efficiency of Ser2 phosphorylation. Taken together, these observations underscore the role of LCD-mediated interactions in regulating the transition from transcriptional initiation to elongation.

Figure 1 Models depicting the mechanisms by which low-complexity sequence domain-driven interactions between transcription factors and coactivators at gene enhancers contribute to transcriptional activation.

Chromatin readers

Bromodomain-containing protein 4 (BRD4) is a critical transcriptional and epigenetic regulator in ESCs[78-80]. It contains two bromodomains that recognize acetylated lysines on histones H3 and H4 that are associated with active gene promoters[81]. BRD4 also acts as a scaffold for recruiting P-TEFb and chromatin remodeling proteins to facilitate transcription by Pol II[78,82,83]. BRD4 has been shown to colocalize with Mediator at super enhancers that control genes important for stem cell identity[62] (Figure 1B). BRD4 contains an LCD at its C-terminus with high proline and glutamine content. Studies showed that the BRD4 LCD by itself can form LLPS droplets in vitro and can be incorporated into MED1 condensates. These results suggest that LLPS between Mediator and BRD4 represents a mechanism by which they are concentrated at super enhancers[62,84]. This is supported by the observation that treatment of cells with 1,6-hexanediol reduced their occupancy at enhancers. It would be interesting to examine whether binding of BRD4 to acetylated nucleosomes promotes its LLPS with Mediator, due to increased valency (i.e., cooperativity) in interactions by BRD4[85]. While the mechanism by which BRD4 recruits P-TEFb to gene promoters is unknown, it is tempting to speculate that their interaction could be promoted by their respective LCDs.

Stem cell-specific coactivators

Most if not all of the regulatory factors described thus far are utilized by many transcription factors to activate their target genes in both ESCs and somatic cells. Our work and that of others indicated that robust transcriptional activation by OCT4 and SOX2 in ESCs requires additional coactivators that are distinct from Mediator[30,86-88]. Using a fully reconstituted in vitro transcription assay, we detected multiple novel coactivators that work in concert with OCT4 and SOX2 to activate pluripotency gene transcription. Biochemical purification of these coactivators led to the discovery of three stem cell-specific coactivators: the nucleotide excision repair protein xeroderma pigmentosum, complementation group C (XPC)[87-90], the dyskerin (DKC1) ribonucleoprotein complex[86], and the ATP-binding cassette subfamily F member 1 (ABCF1)[30] (Figure 1B). We found that the ability of XPC and DKC1 to stimulate OCT4/SOX2-activated transcription is strongly dependent on ABCF1, indicating a pivotal role of ABCF1 in mediating stem cell-specific transcription.

ABCF1 contains an LCD at the N-terminus that is unusually rich in charged amino acids, of which about 40% are divided between lysine and glutamic acid residues. These clusters of positively and negatively charged amino acids, interspersed with hydrophobic residues such as phenylalanine, are known to promote LLPS[91,92]. Indeed, we showed that ABCF1 undergoes LLPS in an LCD-dependent manner. More importantly, the LCD is also required for transcriptional activity in vitro and in ESCs, due to its ability to selectively interact with SOX2 (but not OCT4), its co-dependent coactivators XPC and DKC1, as well as Pol II. These LCD-driven interactions are also detected at OCT4/SOX2-target gene enhancers and are sensitive to disruption by 1,6-hexanediol treatment. It is worth noting that the conformationally flexible XPC protein also contains several highly disordered regions that we found, however, to be dispensable for transcriptional activation[87-90]. These observations revealed the unique ability of the ABCF1 LCD to integrate multiple lines of information encoded by SOX2, XPC, DKC1, and the Pol II machinery, likely by forming a hub of these factors at target gene promoters through selective multivalent interactions (Figure 1B). In summary, cell type-specific transcriptional activation in ESCs requires an interconnected network of LCD-driven interactions by both general and cell-specific coactivators for optimal and gene-specific transcriptional activation.

Transcriptional repression

During stem cell self-renewal, developmental genes must be properly silenced. Failure to repress these genomic regions compromises stem cell identity and pluripotency of ESCs[93-96]. Studies have shown that heterochromatin is essential for silencing the autosomal imprinted genomic loci, HOX gene clusters, and other differentiation-associated genes[97,98]. Heterochromatic regions are characterized by hypoacetylated histones and repressive modifications such as trimethylated histone H3 lysine 9 (H3K9me3), trimethylated histone H3 lysine 27 (H3K27me3), and mono-ubiquitination of histone H2A lysine 119 (H2AK119ub)[99-103]. These modifications not only control nucleosomal interactions but also regulate the association of non-histone chromosomal proteins that together influence nucleosomal packaging and gene repression. For example, heterochromatic regions are established and protected by chromatin components and trans-acting factors such as heterochromatin protein 1 (HP1) and Polycomb repressive complexes 1 and 2 (PRC1, PRC2)[104] (Table 1). Understanding how histone binding proteins and histone modifying enzymes are assembled at heterochromatin will elucidate the mechanisms by which a repressed chromatin state is initiated and maintained to silence developmental genes during stem cell self-renewal, and how these heterochromatic regions are decondensed to facilitate their reactivation when ESCs undergo differentiation (Figure 2A). The highly compact heterochromatin structure has led to a number of studies that invoke LLPS for heterochromatin domain formation.

HP1

Compaction of chromatin is a key process in maintaining the repressed state of heterochromatin. HP1 recognizes H3K9me3 modifications through its chromodomain and nucleates chromatin condensation[105,106]. Underscoring a direct role of HP1 in chromatin condensation, artificial targeting of HP1 to a genomic locus is sufficient to cause local condensation and formation of higher-order chromatin structure[107]. In mammals, HP1 exists in three isoforms: HP1α, β, and γ. HP1α is commonly associated with silenced heterochromatic regions, while the other two isoforms appear to have both gene silencing and activating functions[108-111]. These HP1 proteins possess three LCDs (LCD1, 2, and 3). Interaction between LCD1 and LCD2 has been shown to contribute to multivalent interactions with nucleosomes[112-114]. HP1α LCD1 in the N-terminal extension (NTE) region has also been shown to bind DNA, which in turn induces DNA compaction and phase separation in vitro and in cells (Figure 2B). Phosphorylation of the NTE of HP1α was shown to disrupt the cooperative binding between HP1α and DNA, resulting in reduced DNA compaction with less defined compaction domains and a slower compaction rate[115,116]. These observations are consistent with another study demonstrating that specific loss of HP1α leads to dysregulation in establishing heterochromatin domains[117]. Interestingly, these phosphorylation sites are absent in HP1β and HP1γ, making regulation of HP1 LLPS and chromatin compaction by phosphorylation a unique property of the α isoform[118]. However, a recent study challenges the role of phase separation of HP1 in heterochromatin formation[119]. The authors demonstrated that HP1 proteins do not form stable LLPS droplets in mouse cells and do not regulate the size, accessibility, or compaction of heterochromatin. Chromatin compaction tolerates loss of HP1 and H3K9me3, and relaxation of heterochromatin upon transcriptional reactivation occurs independently of HP1/H3K9me3. Future studies will be required to resolve the apparent discrepancy.

PRCs

The recruitment of PRC1 complexes to chromatin drives nucleosome compaction and transcriptional silencing[10,103,120,121]. This is mediated by the chromobox 2 (CBX2) subunit of PRC1, which recognizes H3K27me3 that is deposited by the histone methyltransferase Enhancer of zeste homolog 2 subunit of the PRC2 complex[122,123]. Once PRC1 is recruited to H3K27me3, it monoubiquitinates H2A at lysine 119 (H2AK119ub), which is essential for maintaining gene repression in ESCs[124]. It has long been observed that PRC1 complexes form concentrated nuclear compartments known as Polycomb bodies[54,125]. Recent studies indicated that CBX2 is responsible for PRC1 LLPS and chromatin compaction[54] (Figure 2C). CBX2 is a low-complexity disordered protein containing a serine-rich patch and a region rich in positively charged amino acids. It has been shown that phosphorylation of serine residues by casein kinase 2 enhances CBX2 LLPS in vitro, likely by facilitating electrostatic intermolecular interactions between phosphorylated serines and positively charged lysines. Consistent with this hypothesis, mutation of 23 lysine and arginine residues to alanine abolishes CBX2 LLPS in vitro. Importantly, lysine to alanine substitutions in CBX2 result in axial patterning defects in mice, indicating altered Hox gene expression patterns during development[53]. Thus, these results support a functional link between CBX2 LLPS and gene silencing. Taken together, these studies suggest a role of LLPS in gene repression through LCD-driven chromatin condensation, and in the proper reactivation of developmental genes in a spatially and temporally regulated manner. It appears that LLPS may play a role in concentrating factors that are critical for chromatin compaction and maintenance of the repressed chromatin state, and in excluding factors that would otherwise gain access to these repressed domains and interfere with gene silencing[115].

Figure 2 Models showing the role of low complexity domain-mediated protein condensation in gene silencing by heterochromatin formation.

INTEGRATION OF SIGNALING PATHWAYS AND TRANSCRIPTION BY LCDS

In mouse ESCs, Hippo/Yes-associated protein (YAP)/transcriptional coactivator with PDZ-binding motif (TAZ), Janus kinase (JAK)/signal transducer and activator of transcription (STAT), Wingless-related integration site (Wnt)/β-catenin, and transforming growth factor beta (TGF-β) pathways play important roles in supporting stem cell self-renewal and pluripotency[126-130]. How ESCs integrate and interpret these signals and generate an appropriate transcriptional response to these cues is key to understanding the fundamental mechanisms governing the cell fate decision between self-renewal and differentiation.

Hippo

The Hippo pathway controls cell proliferation and survival by regulating the activity of YAP, a transcriptional coactivator for transcriptional enhancer factors (TEFs)[131-133]. The Hippo pathway regulates YAP activity primarily by controlling its nucleocytoplasmic shuttling through phosphorylation. Activation of the Hippo pathway by signals derived from cell-cell contact, mechanosensing (i.e., substrate stiffness), and cellular stress inhibits YAP by phosphorylation at serine 127, leading to its sequestration in the cytoplasm. When Hippo signaling is inactivated, YAP translocates to the nucleus and stimulates TEF-activated transcription by forming complexes with Mediator and BRD4/P-TEFb[134,135]. Other studies added complexity to this model, by showing that hyperosmotic stress also activates nemo-like kinase, which leads to YAP phosphorylation at serine 128 and, unexpectedly, translocation to the nucleus and activation of YAP-dependent genes, despite simultaneous phosphorylation at serine 127 by the Hippo pathway[136,137].

YAP is enriched in pluripotent ESCs, but its level significantly decreases upon differentiation and it is further inactivated by phosphorylation at serine 127[127]. YAP supports stem cell maintenance by binding to key pluripotency-associated genes such as Nanog, Oct4, and Sox2 and regulating their expression. How YAP stimulates the transcription of these genes was unknown, but recent studies implicated phase separation of YAP and its paralogue TAZ as a key mechanism. In one study, YAP was shown to form liquid-like condensates with TAZ and TEF in the nucleus upon hyperosmotic stress[138]. In another study, TAZ but not YAP was shown to undergo LLPS when the Hippo pathway is inhibited, even though YAP and TAZ show extensive sequence similarities[139]. Formation of TAZ condensates in cells is regulated by the Hippo pathway, where signals that promote nuclear retention of TAZ induce the formation of nuclear puncta that colocalize with Pol II, BRD4, MED1, and CDK9/P-TEFb, indicating that these condensates likely represent transcriptionally active compartments. Protein domain swapping experiments demonstrated that the WW and coiled-coil (cc) domains of TAZ (but not YAP) contribute to LLPS. This result is in contrast to the study by Cai et al[138] showing that YAP can in fact phase separate in vitro. Differences in protein preparation, concentration, and in vitro droplet formation assay conditions may explain the apparent discrepancy. Nevertheless, both studies demonstrated that the ability of YAP or TAZ to activate its target genes requires its LCDs, suggesting that transcriptional activation by TEF is facilitated by LCD-mediated interaction with YAP/TAZ.

Wnt, TGF-β, JAK/STAT pathways

Master transcription factors such as OCT4 and SOX2 define ESC identity in part by integrating extracellular signals at gene enhancers to drive cell-specific transcription. It has been shown that terminal effectors of the Wnt, TGF-β, and JAK/STAT signaling pathways, β-catenin, small mothers against decapentaplegics (SMADs), and STAT3, respectively, converge onto cell-specific super enhancers[140]. How these enhancers “hijack” signal-regulated transcription factors is not well understood. A recent study showed that these signaling effectors synergize with OCT4, SOX2, and Mediator by forming transcription condensates at super enhancers[141]. Upon activation of the signaling pathways, β-catenin, SMADs, and STAT3 translocate to the nucleus and form condensates at the super enhancer at the Nanog locus in mouse ESCs (Figure 3). By contrast, activation of Wnt signaling was not sufficient to target β-catenin to the transcriptionally silenced Nanog locus in the muscle cell line C2C12. These results indicate that recruitment of β-catenin to the Nanog enhancer likely requires open chromatin, active transcription, and the presence of other transcription factors bound at active enhancers. Perhaps the high density of transcription factors and abundance of LCD-mediated multivalent interactions at super enhancers promote efficient concentration of signal-dependent transcription factors. Indeed, β-catenin, SMADs, and STAT3 were shown to form condensates with Mediator in vitro through their LCDs. Mutations that disrupt β-catenin LLPS also compromise recruitment to its target gene enhancers and transcriptional activation, supporting a functional correlation between LLPS propensity and transcription factor recruitment and gene activation. Compartmentalization of these signaling effectors not only concentrates these factors at the appropriate enhancers but may also insulate these factors from activating the wrong targets. These LCD-dependent multivalent interactions at enhancers likely permit dynamic regulation of transcription, a key feature of regulated gene expression in response to extracellular signaling.

Figure 3 Activation of the wingless-related integration site, transforming growth factor beta, and Janus kinase/signal transducers and activators of transcription signaling pathways leads to nuclear translocation of their respective terminal signaling effectors: β-catenin, SMAD family member 3, and signal transducers and activators of transcription 3.

LCDS IN DNA REPAIR AND DNA DAMAGE RESPONSE

Unlike terminally differentiated somatic cells, the fast replication rate of ESCs makes them prone to replication stress-induced DNA damage such as double strand breaks (DSBs)[142-144]. At the same time, a high proliferation rate poses a significant challenge to DNA repair because DNA lesions that are left unrepaired prior to cell division will be inherited by daughter stem cells and then propagated to their progenitors, likely leading to deleterious effects in development[145]. Therefore, ESCs are under increased pressure to efficiently and accurately repair DNA damage. Indeed, it has been shown that ESCs express higher levels of DNA repair factors and favor high fidelity DSB repair by homologous recombination (HR)[28]. It has also been shown that ESCs are hypersensitive to DNA damage and readily undergo spontaneous differentiation and apoptosis[145,146]. This is likely a fail-safe mechanism that eliminates compromised ESCs from the self-renewing population. In the following sections, we will examine the role of LCDs and LLPS in DNA repair and DNA damage response (DDR) and discuss how they safeguard stem cell genome integrity.

HP1 and F-actin

The abundance of repetitive sequences in heterochromatin poses unique challenges to DNA repair due to increased risks of aberrant recombination induced by DSBs, which can lead to deletion, duplication, and translocation[147]. Cells have developed elaborate mechanisms to promote efficient and error-free DNA repair by taking advantage of LLPS. Upon DSB, it has been shown that phosphorylation of threonine 51 in HP1 leads to dissociation of HP1 from heterochromatin, as evidenced by loss of binding to H3K9me3 and dispersal of HP1 nuclear puncta[115]. Dissociation of HP1 likely alters LLPS status at DSBs, which in turn facilitates chromatin relaxation and engagement of downstream effectors to initiate DNA repair[148]. In another study using Drosophila as a model system, the phase separated heterochromatic domain at DSBs appears to be able to exclude repair factors such as Ku80 that are involved in error-prone non-homologous end joining (NHEJ), and enrich factors required for the initial steps of HR repair[149]. It was proposed that such an exclusion mechanism favors repair by error-free HR, assuming that HR repair factors can still efficiently access the damaged site. It will be interesting to test whether this exclusion mechanism is mediated by selective interaction between HP1 condensates and repair factors in the HR but not the NHEJ pathway.

Expansion of the HP1-organized heterochromatin domain is also thought to facilitate the physical relocation of DSB DNA to the nuclear periphery in Drosophila, or to the periphery of the heterochromatin domain in mouse cells, locations that are believed to be more conducive to repair by HR[150,151]. Studies demonstrated that the physical movement of heterochromatic DSBs depends on polymerization of F-actin and on myosins, which tether the DSB DNA and 'walk' it along the F-actin filaments[150]. It has been shown that F-actin crosslinked by filamin spontaneously assembles into phase-separated F-actin filament bundles that can extend and contract[152]. We speculate that changes in actin filament dynamics driven by phase separation could facilitate the relocation of heterochromatic DSBs to appropriate subcellular compartments as DSB repair progresses.

DDR factors and post-translational modifications

Fused in sarcoma, phosphorylation, and poly-adenosine diphosphate (ADP)-ribosylation: Fused in sarcoma (FUS, also known as translocated in liposarcoma, TLS) is one of the most studied proteins known to undergo phase separation. Its unstructured N-terminal prion-like domain is required for phase separation[153-155]. In addition to its role in RNA metabolism, recent studies highlighted a role for FUS in DDR. Upon DNA damage, FUS is rapidly recruited to DSB sites[156,157]. It has been shown that poly-ADP-ribosylation (PARylation) at DSBs by poly-ADP-ribose polymerase enzymes triggers the translocation of cytoplasmic FUS to the nucleus and the formation of large phase-separated FUS-containing compartments at DSB sites[158-160] (Figure 4A). These FUS compartments are thought to contribute to DNA repair by facilitating the recruitment of downstream effectors of DNA repair such as p53-binding protein 1 (53BP1)[161] (Figure 4B). The C-terminal arginine-glycine-glycine repeat (RGG) domain of FUS likely plays a role in LLPS by directly binding PAR[159]. Therefore, the high propensity of FUS to generate large phase-separated domains in cells could be due to the increased interaction valency gained by using both the N-terminal prion-like LCD and the C-terminal RGG domain when PAR accumulates at DSBs. It is worth stressing that these phase-separated compartments are not static structures. Indeed, multivalent interactions by FUS can be destabilized by phosphorylation of the prion-like LCD[162] and by PAR hydrolysis by PARG[158]. The reversible nature of FUS LLPS compartments is likely a necessary feature of DNA repair, in which the recruitment of repair factors to damaged sites and their exclusion from those sites must be dynamically regulated.

53BP1: In addition to PARylation at DSBs, phosphorylation of histone variant H2AX is another early event in DDR and is required for the recruitment of 53BP1 to DSB sites[163] (Figure 4B). 53BP1 has been shown to generate sizeable chromatin domains in the nucleus that persist throughout the repair process[164] and is thought to recruit downstream effectors to regulate DDR and repair (Figure 4C)[163]. Recent studies demonstrated that these 53BP1 domains display liquid-like properties[164,165]. One study showed that 53BP1 can concentrate p53 into 53BP1 condensates and activate p53-target gene expression, thereby inducing a cell cycle checkpoint DDR[164] (Figure 4D). The authors showed that conditions that perturb 53BP1 condensate formation also compromise p53 signaling, indicating that the recruitment of p53 to 53BP1 condensates is likely important for proper activation of a p53 response in damaged cells. It will be interesting to examine whether 53BP1-dependent activation of p53 contributes to the repression of pluripotency genes and the activation of differentiation-associated genes observed in damaged ESCs[166]. Surprisingly, while 53BP1 contains a largely unstructured N-terminal domain, this domain is dispensable for LLPS in vitro[164]. Rather, the structured Tudor domain is required for phase separation by 53BP1. It is speculated that multivalent interactions between tyrosines (Y) and arginines (R) in the Tudor domain promote LLPS, similar to the role of Y/R in phase separation of the FET (FUS, Ewing sarcoma breakpoint region 1, TATA-box binding protein associated factor 15) protein family[167]. Future mutagenesis studies should help clarify the LLPS mechanism employed by 53BP1. The ability of the 53BP1 Tudor domain to undergo LLPS demonstrates that structured domains can also contribute to phase separation. Another study highlighted the involvement of damage-induced long non-coding RNA (dilncRNA) at DSBs in organizing 53BP1 condensates[165]. The authors showed that assembly at DSBs of a PIC containing Pol II, Mediator, and P-TEFb, together with transcription of dilncRNAs, facilitates molecular crowding and phase separation of DDR factors including 53BP1 (Figure 4E). Supporting this notion, inhibition of dilncRNA transcription reduces both the size of 53BP1 condensates and repair efficiency. Given that FUS binds RNA[168] and phase separates with the Pol II CTD[155], it is tempting to speculate that transcription of dilncRNAs by Pol II at DSBs may also facilitate the incorporation of FUS into repair condensates.

Figure 4 Role of low complexity domain-driven condensate formation in DNA damage response and DNA repair.

ABCF1 and intracellular DNA sensing: ABCF1 was previously identified as a sensor for intracellular DNAs that arise from infection or DNA damage[169]. Binding of these DNAs by ABCF1 triggers an innate immune response in somatic cells. However, because ESCs lack a canonical innate immune response to DNAs[170-172], the functional consequence of DNA sensing by ABCF1 in ESCs is unknown. Our identification of ABCF1 as a critical stem cell coactivator prompted us to examine whether ABCF1 can couple DNA sensing with stem cell transcription in response to DNA damage[30] (Figure 5A). We found that ABCF1 specifically binds double-stranded (ds) but not single-stranded (ss) DNAs in an LCD-dependent manner. Remarkably, binding of ABCF1 to dsDNAs dramatically stimulates LLPS in vitro. These results suggest that, upon DNA damage, ABCF1 may preferentially form condensates with dsDNAs in damaged ESCs instead of binding SOX2 and Pol II. Consistent with this model, we found that the interaction of ABCF1 with SOX2 and the assembly of the Pol II transcription machinery at pluripotency gene promoters are disrupted upon DNA damage, resulting in downregulation of pluripotency genes critical for stem cell maintenance (Figure 5B). We propose that ESCs may leverage the ability of ABCF1 to switch between transcription factor and dsDNA condensates to modulate pluripotency gene transcription. Direct coupling of DNA sensing and stem cell-specific transcription via ABCF1 may represent an effective strategy to safeguard genome integrity by eliminating compromised ESCs from the self-renewing population through enforced differentiation.

Figure 5 ATP-binding cassette subfamily F member 1 couples stem cell-specific transcription with DNA sensing in embryonic stem cells.

CONCLUSION

A growing number of factors have been shown to form condensates with the MED1 subunit of the Mediator complex. Less clear are the mechanisms by which MED1 forms these numerous, functionally distinct condensates. Changes in their composition upon signaling pathway activation, and at different stages of gene transcription where "cargoes" are handed off from one condensate (e.g. initiation) to another (e.g. elongation), must be tightly regulated. A key challenge is how to avoid accidental mixing of these MED1 condensates. Post-translational modifications of the CTD of Pol II provide one such strategy, wherein different phosphorylated forms of the CTD (Ser5 vs Ser2) condense preferentially with regulatory factors in initiation or elongation. In addition, we propose that the requirement for coactivators such as ABCF1 to form stem cell-specific multivalent interactions adds another layer of specificity for gene regulation in ESCs.

Evidence of LLPS in cells, particularly with respect to transcription factors, relies in part on observations of phase separation behaviors in vitro: the assemblies are spherical in shape, can fuse and undergo fission, and allow exchange of biomolecules. However, these properties are not unique to LLPS. Indeed, a recent study on Pol II compartment formation during herpes simplex virus type 1 infection highlighted that, despite sharing several properties that are consistent with phase-separated condensates, these Pol II compartments are formed by non-specific interactions with viral genomic DNA, distinct from behaviors typically attributed to Pol II condensates[173]. In another study, it was shown that at physiological concentration, TFs activate Pol II transcription at endogenous genomic loci by forming dynamic LCD-driven hubs in the absence of LLPS[31]. Therefore, there are likely multiple pathways by which clustering of biomolecules in cells can be achieved without undergoing LLPS. In fact, a recent study provides evidence that formation of transcription factor droplets can actually be counterproductive to gene activation[174], suggesting that the topology and binding dynamics of multivalent interactions are critical for protein function in transcription and likely in other cellular processes. For discussion of the role of phase separation in biological reactions, we recommend several excellent reviews on evidence for and against LLPS in cells[23,175-177].

Whether or not these LCD-driven domains in cells meet the criteria of LLPS, it is evident that an intricate network of multivalent interactions controls various steps in transcription, their integration with signaling pathways, and DNA repair and DDR, processes essential for maintenance of stem cell pluripotency and genome integrity. Transient and weak protein-protein and protein-nucleic acid interactions mediated by LCDs in regulatory factors benefit biological reactions in three ways: they enhance efficiency by enriching relevant factors in distinct hubs or compartments, confer specificity through combinatorial assembly, and allow dynamic regulation in response to a changing cellular environment by modulating LCD-LCD interaction affinity and specificity.

ACKNOWLEDGEMENTS

We apologize to colleagues whose work was not mentioned in this manuscript, due to the limited scope of this review. The authors thank Agarwal S, Chong S, and Zhang Z for valuable discussion.