Yugo R Kamimura, Motomu Kanai, Chemical Insights into Liquid-Liquid Phase Separation in Molecular Biology, Bulletin of the Chemical Society of Japan, Volume 94, Issue 3, March 2021, Pages 1045–1058, https://doi.org/10.1246/bcsj.20200397
Abstract
Liquid-liquid phase separation (LLPS) in living organisms is a recently emerging biologic principle that may dramatically alter current perceptions of cellular systems. Various proteins, RNAs, and other biomolecules undergo LLPS, exhibiting various cellular functions. The field is still immature, however: there is no consensus regarding the basic experimental techniques used for characterizing the phenomenon, knowledge of the physicochemical basis driving and regulating LLPS in cells is insufficient, and very little is known about potential chemical interventions for LLPS. Addressing these deficiencies requires chemical approaches, and will markedly advance drug discovery, molecular biology, and medicine. Here, we introduce the basic biology of LLPS and present challenges in the field from a chemical viewpoint.
1. Introduction
Liquid-liquid phase separation (LLPS) is a well-known physicochemical phenomenon in polymer science that has recently emerged as a fundamental infrastructure governing various cellular processes, including gene regulation,1,2 metabolism,3 and the stress response,3–5 in living organisms. LLPS in cells was first discovered by Brangwynne et al. in 2009, as a phase-separated condensate of RNAs and proteins.6 They found that P granules, which are expressed in early embryos of C. elegans and have long been known as granular structures formed by RNAs and RNA binding proteins,7–10 are not conventional biomolecular complexes, but are rather phase-separated liquid droplets (Figure 1A).6

Figure 1. A) P granules formed by GFP fusion PGL-1 (green) in C. elegans. Reprinted in part with permission from The American Association for the Advancement of Science (AAAS). B) Various functions of biomolecular condensates formed by LLPS. The circle designates the liquid droplet. Activation: Biomolecules (orange hexagon) can be specifically activated by activator molecules, or their substrates (blue triangle), in the droplet. Inactivation: Biomolecules (orange hexagon) can be specifically inactivated in the droplet through the exclusion of activator molecules, or their substrates (blue triangle), from the droplet. Metabolism: Enzymes involved in a metabolic pathway (orange hexagon, blue triangle, and green pentagon) can be specifically concentrated in the droplet, allowing for the promotion of successive reactions. Filtration: Specific molecules can be selectively transported into the cell through the droplet on a cell surface acting as a filter of molecules. C) Four essential functions of biomolecular condensates. D) Scaffold-Client model of biomolecular condensates. The client that binds to excess scaffold is recruited to the droplet. Reprinted with permission from Elsevier. E) Promotion of microtubule formation by microtubule associated proteins (MAPs) and tubulin recruitment and concentration to the centrosome. A gel-like scaffold is formed by the scaffold protein (orange hexagon) around the centrioles (blue cylinders), then MAPs and tubulin (blue triangle) are selectively concentrated to the scaffold. This leads to the promotion of microtubule formation. F) Microscopy image showing droplets comprising the tSUV39H1/HP1β complex (red) and H3K9me3 12xNA (blue) that exclude mTFIIB (green). The white arrow is irrelevant for this review. Reprinted with permission from Elsevier. G) Switch-like property of the liquid droplets. Phosphorylation of the Pol II CTD causes the transition of Pol II from the initiation condensate to the elongation condensate.
Since then, many cellular architectures, including those in mammalian cells,2,4,5,11,12 have proven to be phase-separated droplets. For instance, the nucleolus,13 where ribosome biogenesis takes place;14 super-enhancers,15 which comprise a cluster of enhancers with high-level binding of transcriptional coactivators;16 and heterochromatin,11,17 which is a cluster of repressed genes,18 are all formed by LLPS.
Phase-separated droplets composed of biomolecules are also called biomolecular condensates,19–22 and exhibit various unique functions either inherently or synthetically, such as activation23–25 or inactivation17 of certain molecules; buffering of molecular concentrations in the cell;26,27 sensing of stimuli;3–5 compartmentalization;12,28 promotion of successive reactions;29 and molecular filtration30 (Figures 1B and 1C, see section 2 for details). On the other hand, dysfunctions of LLPS are related to various diseases including neurodegenerative diseases31–33 and cancers.34,35 Thus, biomolecular condensates are an attractive novel target for chemical intervention, such as drug discovery or tools for chemical biology.
2. Intracellular Liquid-Liquid Phase Separation: An Overview
These biomolecular condensates are mainly composed of proteins and nucleic acids, especially RNAs.34,36 Droplet components can be divided into two categories according to their role in droplet formation:37 1) scaffolds, which form the backbone of the droplet and are thus an essential component of the condensate, and 2) clients, which are not necessary for the droplet architecture but are recruited to the droplet through affinity with the scaffolds (Figure 1D).
From the functions of biomolecular condensates mentioned in section 1, we propose the following four essential properties that lead to the various functions of phase-separated biomolecular condensates.
1) Concentration of specific biomolecules. Biomolecular condensates contain highly concentrated scaffold molecules.23,38–40 Because the architectural component is highly concentrated in its droplet, client molecules that have affinity with the scaffold are also highly condensed in the droplet.23,37 The degree of concentration ranges from a few-fold to 10⁴-fold, depending on the protein species and conditions.23,38–40 This property promotes various reactions in the droplet.23–25
Woodruff et al. showed that centrosomes, which play a central role in the formation of mitotic spindles during mitosis, are biomolecular condensates.23 Microtubule associated proteins (MAPs) and tubulins exist in high concentrations in centrosomes as clients, thereby accelerating microtubule formation (Figure 1E).23
As a non-obvious consequence of the concentration effect, biomolecular condensates can be a novel mechanism of the non-linear response. In a model study using polymers of small ubiquitin-like modifier (poly-SUMO) and polymers of the SUMO interaction motif (poly-SIM), which form complexes through SUMO-SIM binding, as scaffold proteins, Banani et al. demonstrated that a slight deviation in the scaffold stoichiometry from a 1:1 ratio leads to switch-like alterations of client recruitment (Figure 1D).37 This stoichiometry change can take place through alterations of the protein concentrations or through alterations of the scaffold valency by covalent modifications such as cleavage of SUMO. Cells can control these parameters by expression control or covalent modification enzymes, thereby controlling the composition of the condensates. The partitioning efficiency of client molecules can also be controlled by the apparent affinity to the scaffold; for example, by client valency alterations.
2) Buffering of the biomolecular concentration. Biomolecular condensates can buffer changes in the concentration of their scaffold proteins.34–36 Because the stable concentrations in the droplet and in the bulk phase are determined by the thermodynamic properties of the system, fluctuations in the concentration of the scaffold protein are buffered by the size change of the droplet.34,36,41 This property may enable the maintenance of homeostasis34,36 or reduction of noise in the cells, as demonstrated by Klosin et al.26
3) Sensitivity to stimuli. Biomolecular condensates can be sensitive to changes in the environment, such as temperature,42,43 pH,43 concentrations of salts38,43 or small molecules,15,44 or the state of their components, such as the charge40 or concentration.38,39,45 This leads to the exceptionally dynamic nature of biomolecular condensates, unlike organelles confined by the membrane. Biomolecular condensates can form and dissolve promptly according to transient events such as the cell cycle46 or stress stimulus.47 For example, stress granules appear within a minute when cells are treated with sodium arsenite (NaAsO2) as a stress stimulus, and disappear within a few minutes when the stress is removed.47
4) Compartmentalization of biomolecules. Biomolecular condensates work as a compartment to segregate biomolecules like normal organelles do, with chemical boundaries but without a mechanical barrier.48 Wang et al. showed in vitro that droplets comprised of suppressor of variegation 3–9 homolog 1 (tSUV39H1) protein, heterochromatin protein 1-β (HP1β), and histone H3 lysine 9 trimethylated 12-mer nucleosomal array (H3K9me3 12xNA), which are all implicated in gene silencing,49 exclude mouse transcription factor IIB (mTFIIB) (Figure 1F).12 tSUV39H1 is a histone lysine N-methyltransferase50 and HP1β is a protein with an important role in the packing of heterochromatin marked by histone lysine methylation.49 The nucleosomal array (NA) is a partial chromatin structure and is thus often used as a model of chromatin in vitro.21,38,51,52
The segregation mechanism is governed by the physicochemical properties of proteins as discussed in this review. Therefore, the functional state of proteins marked by posttranslational modifications (PTMs) can also be critical, in addition to molecular species.53 PTMs, such as acetylation,38 methylation,12 and phosphorylation,2 may result in the segregation of modified proteins from non-modified proteins to adapt to a different functional state.
RNA polymerase II (Pol II) transcribes RNAs from DNAs54 and serine residues in the C-terminal domain (CTD) of Pol II are phosphorylated when transcription is initiated.55 Guo et al. showed that serine phosphorylation in the CTD is a mechanism for switching the droplet localization of Pol II (Figure 1G).2 Before phosphorylation, Pol II was dissolved in initiation condensates, which contain various proteins related to transcriptional initiation. After phosphorylation and transcriptional initiation, Pol II became insoluble in the initiation condensates and dissolved in elongation condensates, which contain concentrated splicing factors, thus improving the efficiency of RNA processing. This example demonstrates that changes in the physicochemical properties of proteins by introducing PTMs can lead to dramatic responses (Figure 1G).
By combining these four functions, biomolecular condensates enable dynamic and prompt control of various biologic processes. For example, inactivation of biomolecules and promotion of successive reactions can be achieved by a combination of concentration effects and compartmentalization effects. Concentration effects and sensitivity to thermodynamic changes can facilitate stimulus sensing.
3. Physicochemical Basis of LLPS
Phase-separating biomolecules typically interact with each other by weak multivalent interactions.20,35,56,57 The mechanism of LLPS can be explained by the negative contribution of the internal energy change to the free energy change of the system upon mixing, caused by homophilic interactions.57,58
Together with gain-of-entropy, the free energy gain by releasing internal energy upon solvation is a driving force for the formation of a single-phased solution by dissolving molecules in a solvent. The free energy change vs. the volume fraction of the solute of such a system upon mixing can be depicted as a concave free-energy curve (Figure 2A).

Figure 2. A) The change in the Helmholtz free energy upon mixing, in the case of a low χ parameter value. The horizontal axis shows volume fraction ϕa, and the vertical axis shows free energy Fmix. B) The change in the Helmholtz free energy upon mixing, in the case of a high χ parameter value (blue line), compared with that in the case of a low χ parameter value (orange line). C) The difference in the free energy curve in the case of various χ parameter values. The penalty on the internal energy pushes up the curve as χ increases. D) A part of the free energy curve in the case of a high χ parameter value. The stability of the system and progression mode of LLPS differ depending on the initial volume fraction. E) Behavior of a mixture with a concave energy curve. Any LLPS will lead to destabilization of the system in this case, thus leading to a single-phase solution. F) Behavior of a mixture with a convex region on its energy curve. The marks with a cross represent the binodals, and the circles represent the spinodals. The system can be further stabilized by LLPS, when ϕ0 is inside the binodals, leading to a two-phase mixture. G) Phase diagram and the phase behavior of mixtures in each condition. A solution that belongs to a one-phase regime will not undergo LLPS, whereas a solution that belongs to a two-phase regime may undergo LLPS, the mode of which depends on the initial conditions. The critical point is the condition above which LLPS never occurs.
On the condition that there is no change in the liquid volume upon mixing, a change in the Helmholtz free energy upon mixing, ΔFmix, can be generally expressed by the following equation,

$$\Delta F_{\text{mix}} = \Delta U_{\text{mix}} - T\,\Delta S_{\text{mix}} \tag{1}$$

where ΔUmix, T, and ΔSmix represent the change in the internal energy, the temperature, and the change in the entropy, respectively. According to the Flory-Huggins theory, which gives a lattice-based, mean field approximated model of a polymer solution,59,60 ΔUmix and ΔSmix of a 2-component system consisting of solute A and solvent B can be expanded as follows. ΔUmix will be

$$\Delta U_{\text{mix}} = \chi(T)\,\phi_a \phi_b\, N k_{\text{B}} T \tag{2}$$

$$\Delta U_{\text{mix}} = \chi(T)\,\phi_a \phi_b\, n R T \tag{3}$$

where ϕa and ϕb are the volume fractions of molecular species A and B, respectively; χ(T) is Flory’s χ parameter; N is total number of molecules; n is amount of substance; kB is the Boltzmann constant; and R is the gas constant. In the most simplified model, the χ parameter mainly reflects the interaction energies between the molecules and is defined by the following equation,

$$\chi = \frac{z}{k_{\text{B}} T}\left[\varepsilon_{ab} - \frac{1}{2}\left(\varepsilon_{aa} + \varepsilon_{bb}\right)\right] \tag{4}$$

where z is a constant depending on the model, and εij shows the change in internal energy by interaction between molecules i and j. As shown in eq 4, homophilic interaction, which means small values of εaa and εbb, and/or a large value of εab, implies a large χ parameter value, leading to the tendency to phase separate. The effective χ parameter value depends on temperature, and other factors that might affect effective interaction energies, such as salt concentration, pH, and PTMs. If the response of the χ parameter to change in such factors is steep, the condensate becomes sensitive to environmental changes (i.e., sensitivity described in section 2). ΔSmix can be expressed as follows,

$$\Delta S_{\text{mix}} = -N k_{\text{B}}\left[\frac{\phi_a}{m}\ln\phi_a + \phi_b \ln\phi_b\right] \tag{5}$$

$$\Delta S_{\text{mix}} = -n R\left[\frac{\phi_a}{m}\ln\phi_a + \phi_b \ln\phi_b\right] \tag{6}$$

where m represents the number of lattice cells that one molecule of solute A will occupy. From the above equations, ΔFmix can be expressed as follows.

$$\Delta F_{\text{mix}} = N k_{\text{B}} T\left[\frac{\phi_a}{m}\ln\phi_a + \phi_b \ln\phi_b + \chi(T)\,\phi_a \phi_b\right] \tag{7}$$
In the remainder of this section, we focus on how the χ parameter is linked with the behavior of the solution.
When χ is small enough, i.e., when the molecular interactions favor a mixing state, the second derivative d²ΔFmix/dϕa² is always positive according to eq 7, affording a concave (upward-curving) curve for the dependency of the free energy on volume fraction ϕa (Figure 2A). Assume a solution of A that follows the concave free-energy curve. Let the volume fraction of A be ϕ0 and the free energy of the whole system at the initial state be F0 (Figure 2E). If the mixture phase separates into two phases in which the volume fraction of A in the bulk phase is ϕ1 and the volume fraction of A in the condensate is ϕ2, the free energy after the phase separation, Fsep, will be on the line segment between F(ϕ1) and F(ϕ2). Therefore, when the free energy curve is concave, LLPS in any condition will lead to a higher free energy Fsep compared with the initial free energy F0, and the mixture immediately returns to a single-phase solution.
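The argument above can be checked numerically. The following is a minimal sketch, assuming the standard Flory-Huggins free energy of mixing per lattice site in units of kBT, f(ϕ) = (ϕ/m) ln ϕ + (1 − ϕ) ln(1 − ϕ) + χϕ(1 − ϕ). The function names, the grid resolution, and the equal-volume split used in `chord_excess` are illustrative assumptions, not part of the original treatment.

```python
import math

def f_mix(phi, chi, m=1.0):
    """Flory-Huggins free energy of mixing per lattice site, in units of kB*T."""
    return ((phi / m) * math.log(phi)
            + (1.0 - phi) * math.log(1.0 - phi)
            + chi * phi * (1.0 - phi))

def curvature(phi, chi, m=1.0):
    """Second derivative of f_mix with respect to phi."""
    return 1.0 / (m * phi) + 1.0 / (1.0 - phi) - 2.0 * chi

def critical_chi(m=1.0):
    """Value of chi at which the minimum curvature first reaches zero."""
    return 0.5 * (1.0 + 1.0 / math.sqrt(m)) ** 2

def min_curvature(chi, m=1.0, steps=9999):
    """Smallest curvature found on an evenly spaced grid inside (0, 1)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return min(curvature(phi, chi, m) for phi in grid)

def chord_excess(phi1, phi2, chi, m=1.0):
    """Free energy of an equal-volume split into phi1 and phi2 (the midpoint
    of the chord) minus the free energy of the mixed state at the same
    overall composition. Positive means the split is unfavorable."""
    phi0 = 0.5 * (phi1 + phi2)
    f_sep = 0.5 * (f_mix(phi1, chi, m) + f_mix(phi2, chi, m))
    return f_sep - f_mix(phi0, chi, m)

if __name__ == "__main__":
    for chi in (1.0, 1.5, 2.5, 3.0):
        print(chi, min_curvature(chi), chord_excess(0.3, 0.7, chi))
```

For m = 1 the curvature stays positive everywhere as long as χ is below `critical_chi(1.0)`, so any chord lies above the curve and a split raises the free energy. Above that value a region of negative curvature appears and the same split can lower it.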
On the other hand, when χ is large enough, i.e., when there are strong enough homophilic molecular interactions, the second derivative d²ΔFmix/dϕa² can be negative over a certain range of volume fractions, owing to the negative contribution of the internal energy term. Under homophilic interactions, the system gains energy when the same types of molecules are in close proximity, and/or the system loses energy when different types of molecules are in close proximity. Thus, in this situation, the internal energy change disfavors the mixing state. The penalty of internal energy upon mixing is higher when χ is larger, which implies a convex region of the free-energy curve (Figure 2B, C). In this case, if a solution of A undergoes phase separation, the total free energy of the resulting system Fsep can be lower than the initial free energy F0 (Figure 2F), unlike in the case of a small χ parameter value (Figure 2E). The free energy of the system will be lowest when the volume fractions of the dilute phase and dense phase are at the contact points of the common tangent. The contact points are called binodal points, and because binodal points are not dependent on the total concentration of the molecules, biomolecular condensates buffer the concentration of their components by changing their size (i.e., buffering function described in section 2).
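The buffering behavior can be illustrated with a small numerical sketch. It assumes the symmetric Flory-Huggins case m = 1, in which the common tangent is horizontal and the binodal points are simply the two minima of the free-energy curve, and it assumes constant total volume so that the lever rule gives the volume fraction occupied by the dense phase. The function names and the chosen χ value are hypothetical.

```python
import math

def df_mix(phi, chi):
    """First derivative of the m = 1 Flory-Huggins free energy (units of kB*T)."""
    return math.log(phi) - math.log(1.0 - phi) + chi * (1.0 - 2.0 * phi)

def binodal_symmetric(chi, iterations=200):
    """Return (phi_dilute, phi_dense) for m = 1.

    Assumes 2 < chi < ~600. For m = 1 the curve is symmetric about
    phi = 0.5, so the dilute binodal is the root of df_mix below the
    lower spinodal point, and the dense binodal is its mirror image.
    """
    if chi <= 2.0:
        raise ValueError("for m = 1, phase separation requires chi > 2")
    lo = 1e-300                                   # slope is negative here
    hi = 0.5 * (1.0 - math.sqrt(1.0 - 2.0 / chi))  # lower spinodal, slope positive
    for _ in range(iterations):                   # plain bisection
        mid = 0.5 * (lo + hi)
        if df_mix(mid, chi) < 0.0:
            lo = mid
        else:
            hi = mid
    phi_dilute = 0.5 * (lo + hi)
    return phi_dilute, 1.0 - phi_dilute

def coexisting_phases(phi_total, chi):
    """Return (phi_dilute, phi_dense, dense_volume_fraction) via the lever rule.

    Outside the binodal points the solution stays mixed, reported as
    (phi_total, None, 0.0).
    """
    phi_dilute, phi_dense = binodal_symmetric(chi)
    if not (phi_dilute < phi_total < phi_dense):
        return phi_total, None, 0.0
    dense_fraction = (phi_total - phi_dilute) / (phi_dense - phi_dilute)
    return phi_dilute, phi_dense, dense_fraction

if __name__ == "__main__":
    for phi_total in (0.05, 0.2, 0.3, 0.6):
        print(phi_total, coexisting_phases(phi_total, 2.5))
```

Inside the two-phase regime, changing `phi_total` changes only `dense_fraction`, while the compositions of the dilute and dense phases stay fixed at the binodal points.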
Even when the value of the χ parameter is favorable for LLPS, whether or not LLPS takes place and how it proceeds will depend on the initial volume fraction of the components. As a minimal requirement, the initial volume fraction must be inside the binodal points for phase separation to take place. When the volume fraction is inside the binodal points, the behavior of the phase separation still differs depending on the volume fraction. Even if the energy of the initial state is not the minimal value, the system will not necessarily undergo spontaneous phase separation.
Figure 2D shows a part of the free-energy curve in the case when χ is sufficiently large. When the volume fraction of the initial composition is ϕms, assume that thermal fluctuation causes a local phase separation in which the volume fractions of the resulting two phases are ϕms′ and ϕms′′, respectively. In this situation, the free energy after phase separation will be Fms′, which is on the line segment between F(ϕms′) and F(ϕms′′). Fms′ is higher than the initial energy Fms, so the two phases will immediately mix and return to the initial mixing state. Phase separation from this state requires nucleation, which can trigger phase separation; therefore, the mixing state is metastable in this case. On the other hand, when the volume fraction of the initial composition is ϕus and thermal fluctuation causes local phase separation, in which the volume fractions of the resulting two phases are ϕus′ and ϕus′′, the free energy after phase separation will be Fus′, which is on the line segment between F(ϕus′) and F(ϕus′′). Fus′ is lower than the initial energy Fus, so the two phases will further separate continuously. The mixing state is metastable in the concave region of the Fmix vs. ϕ curve and unstable in the convex region of the curve. Thus, spinodal points, the thresholds between the metastable and unstable regimes, are derived by the following equation.

$$\frac{\partial^2 \Delta F_{\text{mix}}}{\partial \phi_a^2} = 0 \tag{8}$$
Plotting the binodal points and spinodal points at each temperature, salt concentration, and so on, on the plane of the parameter vs. volume fraction will give a phase diagram, which shows the phase behavior of the solution (Figure 2G). The curve generated by connecting binodal points is called a binodal curve or coexistence curve, and the curve generated by connecting spinodal points is called a spinodal curve. If the initial condition is outside the binodal curve, the mixing state is stable and will not undergo phase separation (one-phase regime). On the other hand, if the initial condition is inside the binodal curve (two-phase regime), the system can undergo phase separation. If it is between the binodal curve and spinodal curve, the system will undergo phase separation by nucleation and growth processes, and if it is inside the spinodal curve, the system will undergo spontaneous and continuous phase separation by thermal fluctuation, referred to as spinodal decomposition (Figure 2G).
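As an illustration of how such a phase diagram sorts initial conditions into regimes, the sketch below again assumes the symmetric Flory-Huggins case m = 1 and uses χ itself as the control parameter in place of temperature or salt concentration. For m = 1 the spinodal condition 1/ϕ + 1/(1 − ϕ) = 2χ has a closed-form solution. The function names and regime labels are illustrative assumptions.

```python
import math

def spinodal(chi):
    """Spinodal points for m = 1, or None when chi <= 2 (no unstable region)."""
    if chi <= 2.0:
        return None
    root = math.sqrt(1.0 - 2.0 / chi)
    return 0.5 * (1.0 - root), 0.5 * (1.0 + root)

def binodal(chi, iterations=200):
    """Binodal points for m = 1 by bisection on the slope of the free energy.

    Assumes chi < ~600. Returns None when chi <= 2.
    """
    points = spinodal(chi)
    if points is None:
        return None

    def slope(phi):
        return math.log(phi) - math.log(1.0 - phi) + chi * (1.0 - 2.0 * phi)

    lo, hi = 1e-300, points[0]   # slope < 0 at lo, slope > 0 at the spinodal
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    phi_dilute = 0.5 * (lo + hi)
    return phi_dilute, 1.0 - phi_dilute

def classify(phi0, chi):
    """Name the regime of a solution with initial volume fraction phi0."""
    b = binodal(chi)
    if b is None or not (b[0] < phi0 < b[1]):
        return "one-phase (stable)"
    s = spinodal(chi)
    if s[0] < phi0 < s[1]:
        return "two-phase: spinodal decomposition (unstable)"
    return "two-phase: nucleation and growth (metastable)"

if __name__ == "__main__":
    # Tabulate binodal and spinodal points for a few chi values.
    for chi in (2.1, 2.5, 3.0):
        print(chi, binodal(chi), spinodal(chi))
    for phi0 in (0.05, 0.2, 0.5):
        print(phi0, classify(phi0, 2.5))
```

Plotting the tabulated binodal and spinodal points against χ would give the coexistence and spinodal curves, with the critical point at χ = 2 and ϕ = 0.5 in this symmetric case.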
This explanation using a two-component system can be expanded to multicomponent systems.
4. Interactions Driving Biologic LLPS
As mentioned above, LLPS is mediated by weak multivalent interactions.20,35,56,57 Therefore, biomolecules that undergo LLPS can be modeled as multiple interacting motifs called stickers, and linkers connecting them; a sticker represents a protein, a peptide motif, or an amino acid that interacts with other stickers, and linkers modulate the interactions between stickers through conformational flexibility.35,61–63
Biomolecules that undergo LLPS can be classified into three categories: 1) proteins containing intrinsically disordered regions (IDRs), regions with no defined three-dimensional structure (Figure 3A, Top);45,64,65 2) proteins containing multiple folded binding domains that can bind to their binding partners (Figure 3A, Middle);12,37,39,66 and 3) nucleic acids, especially RNAs (Figure 3A, bottom).67,68 These categories are not exclusive, and are often observed in combination, such as RNA binding proteins containing IDRs67 or proteins with multiple binding domains containing IDRs.12

Figure 3. A) Classification of biomolecules that undergo LLPS. (Top) Proteins with IDRs (designated as black curved lines). (Middle) Proteins with multiple stickers (designated as hexagon and triangle). (Bottom) Nucleic acids (designated as helix). B) (Top) Modes of interaction that drive LLPS. (Bottom) Sidechain structure of amino acids often involved in interaction between IDRs, and the modes of interaction each residue can have. C) (Top) Amino acid composition of FUS1 protein. A vertical bar in each row other than the bottom row indicates each amino acid that belongs to the category. The residues colored red emphasize the abundance of residues characteristic to FUS1 (Gly, Ser, Gln, Tyr, Arg). The red regions in the bottom row and the yellow regions indicate IDRs. (Bottom) Domain structure of FUS1, which consists of prion-like domain (PLD) and RNA binding domain (RBD). D) Amino acid composition of NICD. Residues that have high impact on LLPS of NICD (Tyr, Arg, Asp, Leu, Met, Trp) are colored red. E) Amino acid composition of MED1. Residues that have high impact on LLPS of MED1 (Ser, Arg, Lys) are colored red. F) Amino acid composition of hnRNPA1. Residues colored red emphasize the abundance of the residues characteristic to hnRNPA1 (Gly, Phe, Tyr). G) Distribution of charged residues in NICD and its mutants. Red and blue bars represent negatively charged and positively charged amino acid residues, respectively. H) Distribution of aromatic residues in hnRNPA1 and its mutants.
LLPS derived from proteins bearing multiple folded binding domains (stickers) and inert flexible IDR linkers is relatively straightforward to model and analyze because there is a clear correspondence of the interactions between interacting stickers, they do not have microscopic cooperativity, and the interactions can be recognized by their three-dimensional structures. Phase-separating systems composed of poly-SUMO/poly-SIM37 or poly-FKBP (FK506-binding protein)/poly-FRB domain (FKBP-rapamycin binding domain)66 are examples of such systems, which phase separate by interactions between folded stickers connected by flexible linkers. A phase-separating system composed of poly-PRM (proline-rich motif)/poly-SH3 (SRC homology 3 domain)39 is also this kind of system, in which a specific sequence, PRM, on the polypeptide is specifically recognized by the folded SH3 domain.69
On the other hand, LLPS derived from interactions between IDRs are less well understood because of the complexity of the interactions;70 the interactions involve multiple modes, such as cation-π,71 ionic,40 π-sp2 (including π-π interaction),70,72 dipole-dipole (including hydrogen bonding),70,73 and hydrophobic interactions,47,70 without furnishing fixed and defined three-dimensional structures and interacting in a promiscuous manner.58,61,74 Additionally, the interactions can influence each other to form further complicated interaction networks and microscopic cooperativities.70 RNA plays unique and important roles in forming LLPS by its ability to bind to multiple RNA binding domains and by its nature as a multivalent anionic molecule (Figure 3A, bottom).35,36,67,75 In this section, we focus on the current knowledge of principles that govern the interactions between IDRs.
The amino acid compositions in IDRs are often biased; aromatic amino acids (Tyr, Phe), charged amino acids (Arg, Lys, Glu, Asp), polar amino acids (Gln, Asn, Ser), and a flexible amino acid (Gly) are frequently observed in IDRs (Figure 3B). These amino acids often constitute motifs such as YG/S, FG, RG, GY, KSPEA-, SY-, or Q/N-rich regions, and repeatedly show up in the amino acid sequences.56 Amino acid residues, Arg, Lys, Glu, and Asp,40 or motifs comprising those amino acids interact with each other through electrostatic interactions; Tyr, Phe, Arg, and Lys mediate cation-π interactions;64 Glu, Asn, Ser, and Tyr mediate dipole-dipole interactions;56,70 Tyr, Phe, and Gln mediate π-sp2 interactions (including π-π interaction);70,72 and Leu, Met, Tyr, and Trp mediate hydrophobic interactions40,56,76 (Figure 3B). These interactions hierarchically interplay40,56,64 to regulate the formation and dynamics of the droplet according to the physicochemical characteristics of interaction modes, such as interaction range,40,56 and to the environment.
Though no comprehensive principle has been established thus far, interactions driving LLPS are intensively studied. Among the proteins studied, fused in sarcoma (FUS) protein, a prion-like RNA binding protein,77 is a typical example of intrinsically disordered proteins. FUS family proteins, including FUS, contain prion-like domains (PLD) and RNA binding domains (RBD).64 FUS PLD is a disordered domain and is highly enriched in non-charged polar amino acids, such as Gly, Gln, and Ser, and in aromatic amino acids, especially Tyr. The RNA binding domain consists of a folded RNA recognition motif (RRM) and a disordered region enriched in Arg and Gly (Figure 3C).
Wang et al. analyzed FUS family proteins to elucidate the molecular grammar governing phase behavior. They discovered that proteins with PLD-RBD architectures contain especially high numbers of Tyr and Arg, and that the saturation concentration above which LLPS occurs was determined by cation-π interactions between Tyr and Arg.64 Mutation of Tyr to Phe, or Arg to Lys weakened the interaction and increased the saturation concentration of FUS. The preferred interaction was in the order of Tyr-Arg > Tyr-Lys ≈ Phe-Arg > Phe-Lys. The difference between the abilities of Arg and Lys is likely due to the directionality of Arg defined by the delocalized electron cloud across the guanidinium plane, underlining the sensitivity of LLPS to the interaction mode.
Assuming Tyr and Arg act as stickers, Wang et al. also analyzed linker regions within FUS family proteins.64 Compositional analysis revealed that Gly is the most enriched amino acid, followed by Ser. Additionally, the number of Gln was most strongly negatively correlated with the number of Gly among other amino acids. Consistent with their roles as linkers, mutation of Gly or Ser to Ala, or Gln to Gly did not significantly change the saturation concentration of FUS. In contrast to the effect on the saturation concentration, mutation of Gly to Ala slowed the fusion rate of the droplets by two orders of magnitude, suggesting that flexibility of the linker maintains the fluidity of the droplet, although the possibility that the linker hydrophobicity affects the droplet properties cannot be excluded. Furthermore, mutation of Ser to Ala slowed hardening of the droplets over time. Mutation of Gln to Gly even strongly prevented hardening. Though these residues cannot drive LLPS, as shown by the inability to affect the saturation concentration, these residues contribute to the properties of the droplet once it forms. This demonstrates the hierarchy of interactions underlying the LLPS of FUS.
FUS is also known to phase separate without an RBD region, but only by its PLD, though the saturation concentration is two orders of magnitude higher than full-length FUS.64,76 Burke et al. showed that LLPS of FUS PLD more readily occurs when the ionic strength of the medium is increased from no-salt to 250 mM of NaCl.76 This phenomenon is likely due to the salting out of hydrophobic Tyr residues of FUS PLD, suggesting that LLPS of FUS PLD is mainly driven by hydrophobic interactions. In contrast, LLPS of full-length FUS is not affected by the addition of NaCl up to 150 mM, but is slightly attenuated at 300 mM. Therefore, interactions outside of FUS PLD would contribute to LLPS by full-length FUS, consistent with findings from Wang’s study. Additionally, the moderate salt concentration-dependency of LLPS of the full-length FUS suggests that LLPS of full-length FUS is not predominantly driven by electrostatic interactions.
For further insights into the interactions of FUS PLD that drive LLPS, Murthy et al. combined NMR spectroscopy and computational simulation.70 From doubly edited nuclear Overhauser effect (NOE) experiments, they showed that Ser, Tyr, Gly, and Gln all exhibit similar intermolecular NOE patterns, except for Gly which exhibited a slightly different pattern and weakened intensity. Additionally, they showed that interactions are dispersed along the PLD and only a little bias was observed at a specific sequence region by conducting (1H)13C HSQC-NOESY-1H15N HSQC experiments and paramagnetic relaxation enhancement (PRE) NMR experiments using A16C, S86C, or S142C mutants of FUS PLD labeled by MTSL, which is a thiol-specific spin-labeling reagent.78 These results suggest that many amino acid residues, not a specific amino acid residue, contribute to the intermolecular interactions that drive LLPS.
In response to the strongest NOEs by Ser, Tyr, and Gln, Murthy et al. simulated the number of hydrogen bonds by molecular dynamics simulations and found that Gln-Gln hydrogen bonding was the most abundant, which suggests that Gln residues most strongly contributed to the hydrogen bonding of side chains.70 They made a few FUS PLD mutants containing 8 additional Ser instead of Gln or 12 additional Gln instead of Ser and conducted phase separation experiments in the presence of 1 M NaCl. As expected, reducing the number of Gln residues required a higher concentration of the protein for LLPS to occur, and vice versa. Therefore, they concluded that Gln contributes to LLPS of FUS PLD putatively by hydrogen bonding. They also focused on the salting out of FUS PLD: all-atom simulations to analyze hydrophobic effects revealed that Tyr formed hydrophobic contacts with both Tyr and Gln. These residues also form π-sp2 interactions with contributions from backbone peptide bonds and side chains. As experimental validation, they probed the effects of monovalent Hofmeister cations and anions. LLPS of FUS PLD was strongly affected by Hofmeister anions; chaotropic salt sodium iodide almost abolished the LLPS and kosmotropic salt sodium fluoride enhanced the LLPS. The salts also affected LLPS formation by full-length FUS in a similar manner,70 supporting Burke’s findings indicating the importance of hydrophobic effects.76 Murthy et al. further implied that because Tyr and Gln are both involved in all of hydrogen bonding, hydrophobic effects, and π-sp2 interactions, there might be some cooperativity between these interactions.70
These results suggest that various types of interactions collaboratively regulate LLPS of FUS, and those interactions hierarchically interplay. Cation-π interactions are likely the driver interactions of LLPS, and the other interactions, such as hydrophobic interactions, hydrogen bonding, π-sp2 interactions, and electrostatic interactions64 modulate the phase behavior. The dominance of each interaction may differ depending on the conditions, such as the salt concentration. Further experimental data and analysis are needed to integrate current knowledge, and fully elucidate the physicochemical principles underlying LLPS of FUS protein.
Although many phase-separating proteins undergo LLPS without the assistance of other proteins,27,45,64 there is an example of complex coacervation. Pak et al. analyzed LLPS of the nephrin intracellular domain (NICD).40 NICD is a negatively charged IDR that does not phase separate on its own but undergoes complex coacervation. NICD phase separates only when mixed with positively charged molecules, such as poly-Lys or positively charged supercharged GFPs (scGFPs), a series of mutated variants of green fluorescent protein (GFP) that have a net surface charge of up to +36.79 In cells, as in vitro, NICD forms complexes with cationic biomolecules, and these complexes accumulate by LLPS to form a structure called the nuclear body. Unlike FUS, NICD is not strongly enriched in one or two specific amino acids characteristic of IDRs, but has a composition similar to that of an average, generic IDR (Figure 3D). Thus, to elucidate the determinants of LLPS of NICD, Pak et al. studied the effects of deleting 6–12 amino acid segments on the formation of liquid droplets, instead of mutating specific amino acids. Single deletion of almost any segment reduced phase separation.40 Double deletion of two segments led to a greater reduction of phase separation, with only a single exception. Shuffling residues within each segment either did not affect or improved the ability to phase separate. To determine the stickers, they performed a statistical analysis correlating the deleted residues with the droplet formation ability. Deletion of Tyr most strongly affected the droplet formation ability, followed by Arg, Leu, Met, Trp, and Asp. Mutation of aromatic or hydrophobic amino acids to positively charged Lys resulted in a greater reduction of droplet formation than deletion.
Therefore, the driving force of phase separation would be charge neutralization, which is a long-range interaction, and further modulation is achieved by short-range interactions, namely, aromatic interactions and hydrophobic effects.
The hierarchy of amino acid residues affecting LLPS is likely influenced by their interaction range;40,56 however, it is strongly context-dependent and cannot be determined merely from side-chain structures or physicochemical properties. Mediator complex subunit 1 (MED1) is a transcriptional coactivator and contains an IDR enriched in Ser (Figure 3E).15 Sabari et al. demonstrated that mutating Ser to Ala abolished the ability to form droplets under the same conditions as wild-type MED1.15 Therefore, Ser is indispensable for driving LLPS of MED1, unlike the case of full-length FUS protein.64 Mutating the positively charged residues, Arg and Lys, to Ala abrogated the ability to form droplets, whereas mutating the aromatic residues, Phe and Tyr, to Ala did not abolish the ability to form droplets.27 This tendency is also different from FUS.64
Modes of interaction between IDRs are encoded by amino acid sequences, but it is neither the precise sequence nor the amino acid composition alone that governs the interaction. Pak et al. proposed that, in addition to the interacting residues/motifs, their valency, and the linker residues/motifs, the patterning of stickers greatly affects the behavior of IDRs.40 They changed the distribution of the negatively charged residues in NICD, Asp and Glu, to a patchy pattern with a high local charge density by shuffling charged residues and non-charged polar residues (Figure 3G, NICDCC). This modification of the charge patterning enhanced nuclear body formation.40 In contrast, an NICD mutant with a more scattered charge distribution showed attenuated nuclear body formation (Figure 3G, NICDCS). Two other mutants with a cluster of negatively charged residues in one region were also tested (Figure 3G, $\text{NICD}_{\text{CB}_{\text{N}}}$ and $\text{NICD}_{\text{CB}_{\text{C}}}$). One of them, $\text{NICD}_{\text{CB}_{\text{C}}}$, phase separated to form nuclear bodies in cells with an efficiency equivalent to or higher than that of native NICD, showing that multiple clusters are not necessarily required for droplet formation. The mutant with a negatively charged cluster at its N-terminus (Figure 3G, $\text{NICD}_{\text{CB}_{\text{N}}}$) exhibited greatly diminished phase separation ability. This exception is likely due to a high level of collapse, which can limit the access of counterions and is driven by intramolecular interactions between the negatively charged block and the positively charged residues enriched at the N-terminus.
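The contrast between patchy and scattered charge patterning can be made concrete with a simple score. The sketch below is a minimal illustration and not the analysis used by Pak et al.: the function name `charge_blockiness`, the window length of 5, and the two toy sequences are assumptions introduced here. It computes the variance of the local net charge density along a sequence, so a sequence with charges grouped into patches scores higher than one with the same composition but evenly scattered charges.

```python
# Charge assigned to each residue; all other residues count as neutral.
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def net_charge(segment):
    """Net charge of a segment, counting Asp/Glu as -1 and Lys/Arg as +1."""
    return sum(CHARGE.get(residue, 0) for residue in segment)


def charge_blockiness(sequence, window=5):
    """Variance of the net charge per residue over sliding windows.

    Higher values mean that charge is concentrated in patches;
    lower values mean that charge is spread evenly along the chain.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    densities = [
        net_charge(sequence[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]
    mean = sum(densities) / len(densities)
    return sum((d - mean) ** 2 for d in densities) / len(densities)


# Toy sequences (hypothetical, same composition: 10 Asp and 10 Ser).
patchy = "DDDDDSSSSSDDDDDSSSSS"
scattered = "DSDSDSDSDSDSDSDSDSDS"

patchy_score = charge_blockiness(patchy)
scattered_score = charge_blockiness(scattered)
print(f"patchy: {patchy_score:.3f}, scattered: {scattered_score:.3f}")
```

A score of this kind only ranks sequences by how clustered their charges are. It says nothing about chain collapse or counterion access, which the text above identifies as the likely reason one clustered mutant behaved differently.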
Similarly, Martin et al. showed that the distribution of interacting residues is important for the balance between the LLPS and aggregation regimes.61 They used the IDR of heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1), a FUS family protein enriched in Gly, Tyr, and Phe residues (Figure 3F). 1H-15N heteronuclear single quantum coherence (HSQC) NMR spectra indicated that the protein does not have a fixed secondary structure. Nevertheless, they observed NOEs between the side chains of Tyr and Phe, suggesting transient clustering of aromatic residues acting as stickers. Based on this finding, they constructed mutant proteins with patchy and scattered distribution patterns of aromatic residues (Figure 3H). They found that the protein with a scattered sticker distribution preferentially formed droplets with higher liquidity (Figure 3H, Aroperfect). Even with the same number of stickers, proteins with a patchy distribution of aromatic residues formed amorphous aggregates (Figure 3H, Aropatchy).61 Martin et al. pointed out that the distribution of aromatic residues in the PLDs of RNA binding proteins such as FUS, TATA-box binding protein associated factor 15 (TAF15), Ewing sarcoma breakpoint region 1 (EWSR1), hnRNPA2B1, and hnRNPA3 is strongly conserved among species, although the overall sequence is poorly conserved, which shows the importance of sticker patterning.
To summarize this section, the following five factors are critical for regulating LLPS of biomolecules driven by IDRs: 1) sticker valency and strength, 2) mode of interaction, 3) linker flexibility, 4) propensity of linkers to form hydrogen bonds, and 5) patterning of motifs. Modes of interaction form a hierarchy of dominance: the predominant interaction(s) drive LLPS, and the properties of droplets are modulated by other interactions. There may be cooperativity between interactions, making modeling and analysis difficult. Thus far, there are no models that can precisely predict the phase behavior of proteins with IDRs, nor principles for designing peptides with an intended phase behavior. Developing such models or principles is important for understanding and manipulating biologic LLPS.
5. Characterization of Liquid Droplets
Analysis of the rheologic properties is important for characterizing biomolecular condensates, because liquidity is key to the function of biologic LLPS. The appearance and behavior of large protein complexes formed by a mechanism other than LLPS, such as conventional protein-protein binding, are often very similar to those of biomolecular condensates.80 Distinguishing biomolecular condensates from non-phase-separated complexes and characterizing their liquid properties are therefore fundamental tasks.
The simplest approach is the microscopic observation of putative droplets.6,12,38,39 Circularity and the tendency to undergo fusion and fission,11–13,32 which reflect the surface tension characteristic of liquids,34,57 are good indicators of liquidity.
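Circularity can be quantified from a segmented droplet outline. The source does not state which shape descriptor was used. A common choice, assumed here, is 4πA/P^2, where A is the area and P the perimeter of the outline; it equals 1 for a perfect circle and decreases for elongated or irregular shapes. The function name and example values below are illustrative only.

```python
import math


def circularity(area, perimeter):
    """Return 4*pi*A / P**2 for a closed 2D outline.

    The value is 1.0 for a perfect circle and smaller for any other shape.
    """
    if area <= 0 or perimeter <= 0:
        raise ValueError("area and perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2


# A circle of radius 2 (arbitrary length units).
radius = 2.0
circle_value = circularity(math.pi * radius ** 2, 2.0 * math.pi * radius)

# A unit square, as an example of a non-circular outline.
square_value = circularity(1.0, 4.0)

print(f"circle: {circle_value:.3f}, square: {square_value:.3f}")
```

On pixelated microscopy images the measured perimeter depends on the segmentation method, so values for real droplets typically deviate somewhat from 1 even when the droplets are round.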
Fluorescence recovery after photobleaching (FRAP) is a popular method for quantitatively evaluating the fluidity of droplets in cells.2,12,38,67 FRAP measures the recovery of the fluorescence signal after photobleaching of a specific area of interest in the cell. Fluorescence recovery can take place through molecular diffusion, dissociation and rebinding, or any other kind of turnover process in the target area. Turnover observed by FRAP is therefore not necessarily a consequence of fluidity, and data obtained by FRAP should be interpreted cautiously.
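FRAP traces are often summarized by two numbers: the mobile fraction (how much of the bleached signal returns) and the recovery half-time. The sketch below shows one simple way to extract them from a single trace, under assumptions that are not taken from the source: the first point of the trace is the intensity immediately after bleaching, the plateau is estimated from the mean of the last few points, and the half-time is found by linear interpolation. All function names and the synthetic trace are hypothetical.

```python
import math


def normalize_frap(trace, pre_bleach):
    """Scale a post-bleach trace so that 0 is the bleached level and 1 the pre-bleach level."""
    bleached = trace[0]
    span = pre_bleach - bleached
    if span <= 0:
        raise ValueError("pre-bleach intensity must exceed the bleached intensity")
    return [(value - bleached) / span for value in trace]


def mobile_fraction(normalized, tail=5):
    """Estimate the recovery plateau as the mean of the last `tail` points."""
    tail_values = normalized[-tail:]
    return sum(tail_values) / len(tail_values)


def half_time(times, normalized, tail=5):
    """Time at which the recovery first reaches half of the plateau (linear interpolation)."""
    target = 0.5 * mobile_fraction(normalized, tail)
    for i in range(1, len(normalized)):
        if normalized[i] >= target:
            t0, t1 = times[i - 1], times[i]
            y0, y1 = normalized[i - 1], normalized[i]
            if y1 == y0:
                return t1
            return t0 + (target - y0) * (t1 - t0) / (y1 - y0)
    return None  # the trace never reached half of the plateau


# Synthetic single-exponential recovery (illustrative values, time constant 5 s).
times = [0.5 * i for i in range(121)]
trace = [0.2 + 0.6 * (1.0 - math.exp(-t / 5.0)) for t in times]

normalized = normalize_frap(trace, pre_bleach=1.0)
fraction = mobile_fraction(normalized)
t_half = half_time(times, normalized)
print(f"mobile fraction: {fraction:.2f}, half-time: {t_half:.2f} s")
```

As noted above, a fast half-time or a high mobile fraction reflects turnover in the bleached area and does not by itself establish that the structure is liquid.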
Dissolution of droplets by treatment with certain small molecules that can disrupt hydrophobic interactions, such as 1,6-hexanediol, is sometimes used.81–83 The drawback of this method is the reported unexpected cellular responses caused by the treatment.84
For the collection of data that are difficult to obtain in cells, such as the dependence of phase behavior on protein concentration or temperature, reconstitution and characterization of the system in vitro are important techniques.12,27,39,83 The cellular environment, including the concentration and composition of salts, small molecules, and nucleic acids, the degree of molecular crowding, and the temperature, greatly affects the process of LLPS; therefore, conditions in vitro should be carefully considered.34 The addition of salts12,27,39,83 or crowding reagents such as dextran45,85 or polyethylene glycol27,86 is a popular technique for bridging the gap between the in-cell environment and that in vitro. The properties of droplets reconstituted in vitro, however, tend to differ from those in cells. For example, the partitioning coefficient of droplets tends to be smaller in cells than in vitro.23,40
Typically, several of these methods are combined to ensure the validity of the conclusion.12,38,39,87 Because strict definitions or criteria for LLPS are not yet established, it remains controversial in some cases whether the objects of concern are formed by LLPS. For example, the above-mentioned characterization methods led to the conclusion that the heterochromatin region at chromocenters is formed by LLPS,11,88 but Erdel et al. argued that mouse chromocenters adopt a digital compaction state, which is a consequence of a coil-globule transition.89 Therefore, standardized methods or tools for the detection of LLPS in cells, such as new imaging techniques or chemical probes of LLPS, are in high demand.
6. Chromatin Undergoing Phase Separation
Thus far, many classes of proteins and nucleic acids have been reported to undergo LLPS, but we focus on phase separation of chromatin in the sections below because chromatin and related proteins comprise a major class of phase-separating biomolecules. Furthermore, chromatin LLPS likely has great consequences for the regulation and integrity of various biologic processes, as it can directly affect numerous processes of the central dogma.2,82
Gibson et al. showed that non-modified nucleosome arrays (NAs) undergo LLPS in vitro in the presence of a crowding reagent and physiologic concentrations of salts, or in cells when microinjected into nuclei.38 Removal of histone tails, which are disordered regions of histones enriched in positively charged amino acids, or mutation of basic patches, which are areas of the histone body comprising positively charged amino acids, abolished the droplet formation ability. These findings suggest that the phase separation is caused by interactions involving histone tails and basic patches, e.g., histone tail-DNA-basic patch interactions. Droplet formation depended on the number of nucleosomes on the NA, i.e., the valency of the NA. Whereas the 12-mer phase separated, the 6-mer showed weakened interactions and the 4-mer failed to form droplets. The droplets dissolved when histone tails were acetylated through the recruitment of histone acetyltransferase p300. This result further confirmed that the positive charge on histone tails is important in droplet formation and raises the possibility that chromatin LLPS can be controlled via epigenetic modifications. Furthermore, addition of a 5-mer of bromodomain (Bromo5), which binds acetyl lysine, to the solution of acetylated NA resulted in the re-formation of droplets. Intriguingly, the droplets formed by non-modified NA and the droplets formed by complexes of acetylated NA and Bromo5 contact each other, but hardly merge (Figure 4).

Droplets formed by complexes of acetylated NA (AF488 chromatin, green) and Bromo5 and droplets formed by non-acetylated NA (AF594 chromatin, magenta) contacted each other, but did not merge. Reprinted with permission from Elsevier.
As mentioned in section 2, H3K9 trimethylated NA also undergoes LLPS in the presence of other proteins.12 Acetylation and H3K9 trimethylation of histones are implicated in transcriptional activation90 and repression,91 respectively. Therefore, Gibson’s report38 and Wang’s report12 provide hints for a new molecular basis of gene regulation; marking histones at a specific gene locus by PTMs could induce locus-specific LLPS, which may lead to locus-specific biomolecular inclusion/exclusion and promotion/suppression of biologic responses, such as transcription. The recently proposed super enhancers (SEs) may be relevant to this idea. SEs comprise a cluster of enhancers that are occupied by a high density of transcription factors and coactivators, including bromodomain-containing protein 4 (BRD4) and the mediator complex, existing in LLPS states.16 SEs are often formed at genes that maintain cell identity92 or at highly expressed genes.16
7. Chromatin LLPS Implicated in Cancers
Cancer cells often inappropriately express regulator genes to excessively produce oncogenic proteins or suppress tumor suppressor genes to maintain malignancy. Aberrant recruitment of promoter regions of regulator genes to transcriptional condensates, which can strongly activate transcription of those genes, is a major mechanism underlying cancer etiology.
c-MYC is a major transcription factor, called a master regulator, and regulates up to 15% of the whole human genome.93 c-MYC upregulates many genes, including the genes implicated in cell proliferation94 or cell growth, thereby maintaining cancer cell identity when overexpressed in cancer cells.95 c-MYC is overexpressed in many cancers,96 and the formation of SE at the c-MYC promoter region has been confirmed.97,98
Chromosomal translocation that generates a mutant DNA-binding protein fused with an irrelevant IDR is another known mechanism of LLPS-related tumorigenesis. The aberrant fusion protein binds to target DNA loci and inappropriately forms a transcriptional condensate at those loci. For example, in most cases of Ewing’s sarcoma, the EWSR1/FLI1 fusion protein is generated by chromosomal translocation.99–101 The fusion protein contains an IDR derived from Ewing sarcoma breakpoint region 1 (EWSR1), which interacts with the nucleosome remodeling complex BRG1/BRM-Associated Factor (BAF)102 and Pol II.103 It also has the DNA-binding domain derived from Friend leukemia integration 1 transcription factor (FLI1), which recognizes a GGAA repeat.104 The EWSR1/FLI1 fusion protein functions as an aberrant regulator that forms droplets at the GGAA repeat loci82,102 and alters target gene expression in an IDR-IDR interaction-dependent manner.82,102 Interactions between IDRs are essential for condensate formation, transcription alteration, and formation of a cell colony,82 which indicates cancerous transformation capacity. This kind of chromosomal translocation is also found in human myxoid liposarcoma, where a fusion protein of DNA damage-inducible transcript 3 (DDIT3) with FUS or EWSR1 (FUS/DDIT3 or EWSR1/DDIT3) is generated.105 Thus, specific disruption of IDR-IDR interactions may be a new strategy for tumor suppression.
8. Accumulation of Small Molecules to Biomolecular Condensates
Small molecules that accumulate in a specific type of biomolecular condensate are attractive in drug discovery and chemical biology. Recently, Klein et al. demonstrated that several drug molecules accumulate in biomolecular condensates.27 They found that cisplatin (antineoplastic drug), mitoxantrone (antineoplastic drug), FLTX1 (fluorescent tamoxifen derivative), THZ1 (CDK7 inhibitor), and JQ1 (bromodomain inhibitor) accumulated in purified MED1 droplets or in whole mediator complex droplets in vitro (Figure 5A). As mentioned in section 6, the mediator complex is a component of SEs. Mitoxantrone also accumulated in fibrillarin 1 (FIB1) and nucleophosmin 1 (NPM1) droplets. Both FIB1 and NPM1 are components of the nucleolus, and this finding is consistent with the notion that mitoxantrone accumulates in the nucleolus. JQ1 also accumulates in BRD4 condensates, in agreement with its binding affinity for BRD4. It should be noted that binding to protein pockets is not required for accumulation; none of the above-mentioned drugs is a known ligand of the proteins, except for the JQ1-BRD4 pair.

Analysis on the structural requirement for accumulation in MED1 droplets. A) Structures of drug and drug lead molecules studied for accumulation in biomolecular condensates. B) Variable structures of the BODIPYs that showed top accumulation ability and bottom accumulation ability. Reprinted in part with permission from AAAS.
Klein et al. analyzed the structural requirements for accumulation in MED1 droplets by screening a library of boron-dipyrromethene (BODIPY) derivatives (Figure 5B). Though no statistical analysis was provided, the structures of the top-hit ligands suggest that the presence of an additional aromatic ring is important for the accumulation of BODIPY derivatives in MED1 droplets. They also investigated the structural requirements of the protein, and showed that substitution of aromatic residues, which are abundant in the MED1 IDR, by Ala attenuated the ability of the top-hit BODIPY derivatives and cisplatin to accumulate in the droplets, while maintaining the ability of the protein to phase separate.
As a physiologically relevant phenomenon, platination of DNA proceeded selectively in the droplets both in vitro and in cells, likely due to the high concentration of cisplatin accumulated in the droplets. Furthermore, DNA platination induced dissolution of the condensates at specific genomic loci: in HCT116 human colon cancer cells, which are known to have an SE at the c-MYC locus,98 cisplatin accumulated in the SE, where a high concentration of MED1 exists. Disruption of the SE almost abolished the MED1 contact. This finding implies that cisplatin exhibits or enhances its antineoplastic activity through DNA platination-induced disruption of SEs.
MED1 also forms SEs at the MYC oncogene in an estrogen-bound estrogen receptor α (ERα) dependent manner. The addition of estrogen to ERα-expressing cells promoted the formation of MED1 condensates at the MYC oncogene, whereas the condensates were not observed without estrogen or with tamoxifen, an estrogen antagonist. This result suggests that the accumulation of tamoxifen in MED1 droplets enhances the efficacy of tamoxifen.
Overexpression of MED1 induces tamoxifen resistance in breast cancer, but the mechanism has been unknown. Klein et al. proposed a possible mechanism based on their findings: overexpression of MED1 led to expansion of the droplet volume as a consequence of a size-buffering effect. Accordingly, the concentration of tamoxifen in the droplets was decreased, leading to reduced efficacy.
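The dilution argument can be illustrated with a two-compartment mass balance. This is a deliberately simplified sketch and not a model taken from Klein et al.: it assumes a fixed total amount of drug, a fixed total volume split into a dilute phase and a droplet phase, and a constant partition coefficient K (droplet concentration divided by dilute-phase concentration). The function name and all numerical values are hypothetical and in arbitrary units.

```python
def droplet_concentration(total_drug, total_volume, droplet_fraction, partition_coefficient):
    """Drug concentration inside the droplet phase at equilibrium.

    Mass balance: total_drug = c_dilute * V_dilute + K * c_dilute * V_droplet,
    and the droplet concentration is K * c_dilute.
    """
    if not 0.0 <= droplet_fraction <= 1.0:
        raise ValueError("droplet_fraction must lie between 0 and 1")
    if total_volume <= 0 or partition_coefficient <= 0:
        raise ValueError("total_volume and partition_coefficient must be positive")
    dilute_volume = total_volume * (1.0 - droplet_fraction)
    droplet_volume = total_volume * droplet_fraction
    dilute_concentration = total_drug / (
        dilute_volume + partition_coefficient * droplet_volume
    )
    return partition_coefficient * dilute_concentration


# Same drug amount and partition coefficient; only the droplet volume fraction changes.
small_droplets = droplet_concentration(1.0, 1.0, droplet_fraction=0.01, partition_coefficient=100.0)
large_droplets = droplet_concentration(1.0, 1.0, droplet_fraction=0.05, partition_coefficient=100.0)

print(f"1% droplet volume: {small_droplets:.1f}")
print(f"5% droplet volume: {large_droplets:.1f}")
```

Under these assumptions, enlarging the droplet phase lowers the drug concentration inside it whenever K is greater than 1, which is the qualitative behavior the proposed mechanism relies on.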
These results suggest that accumulation of a drug in certain droplets can enhance drug efficacy. Alternatively, malignant cells may acquire drug resistance by utilizing functions of LLPS. This research postulates a new guideline for drug development; small molecule drugs could be accumulated in a specific droplet depending on their physicochemical properties. Such research directions may also significantly affect the fundamental understanding of molecular mechanisms governing biologic LLPS.
9. Conclusion and Outlook
Various unique regulatory mechanisms for important cellular processes are governed by LLPS. The LLPS paradigm provides new perceptions of bio- and small molecules, PTMs, and comprehensive biologic processes. LLPS in cells is a new principle in biology and medicine that may completely alter the current view of cellular systems. Furthermore, LLPS provides new opportunities for intervention in cellular processes at the molecular level (Figure 6). For example, drug discovery targeting or exploiting LLPS will be an especially appealing new modality for treating refractory diseases, such as neurodegenerative diseases or cancers. The field remains immature, however. Whether an observed phenomenon is driven by LLPS and whether it is physiologically relevant should be carefully examined through thorough biologic assays and probing from the standpoint of molecular chemistry. In this respect, understanding the biology of LLPS from a chemical viewpoint and developing new chemical tools and analytical methods to manipulate and dissect LLPS will be critical to advancing the field.

Acknowledgment
This work is supported by MEXT/JSPS KAKENHI, JP20H00489 and JP19KK0179 (M.K.).
References

Yugo R. Kamimura
Yugo R. Kamimura received his Bachelor of pharmaceutical science degree from The University of Tokyo in 2019 under the supervision of Professor Motomu Kanai. He is currently a master’s degree student at Graduate School of Pharmaceutical Science, The University of Tokyo, under the supervision of Professor Motomu Kanai. His current research interest is liquid-liquid phase separation of chromatin and its chemistry.

Motomu Kanai
Motomu Kanai graduated from The University of Tokyo in 1989. In the middle of his graduate course, he was accepted as an assistant professor at Osaka University in 1992. He received his PhD from Osaka University in 1995. He worked as a postdoctoral fellow at University of Wisconsin, USA, during 1996–1997. Then, he was appointed as an assistant professor at The University of Tokyo in 1997. After advancing to lecturer and associate professor, he became a full professor in 2010. His research interest is catalysis development linking molecular synthesis and life science.