Daniel Keri, Matt Walker, Isha Singh, Kyle Nishikawa, Fernando Garces, Next generation of multispecific antibody engineering, Antibody Therapeutics, Volume 7, Issue 1, January 2024, Pages 37–52, https://doi.org/10.1093/abt/tbad027
Abstract
Multispecific antibodies recognize two or more epitopes located on the same or distinct targets. This added capability through protein design allows these man-made molecules to address unmet medical needs that cannot be met with single targeting, such as with monoclonal antibodies or cytokines alone. However, the approach to the development of these multispecific molecules has been met with numerous road bumps, which suggests that a new workflow for multispecific molecules is required. The investigation of the molecular basis that mediates the successful assembly of the building blocks into non-native quaternary structures will lead to the writing of a playbook for multispecifics. This is a must-do if we are to design workflows that we can control and whose success we can, in turn, predict. Here, we reflect on the current state of the art of therapeutic biologics and look at the building blocks, in terms of proteins, and tools that can be used to build the foundations of such a next-generation workflow.
INTRODUCTION
The exceptional advancements in the fields of molecular and structural biology, together with a better understanding of the immune system and its mechanisms for fighting infections and diseases, provide us with opportunities to manipulate immune cells to improve and extend human life [1]. The endeavor of drug discovery has seen significant changes since its modern inception in the late nineteenth century both in terms of approach and drug modalities, catalyzed by a series of key technological breakthroughs. The establishment of X-ray crystallography in the 1950s enabled the atomic visualization of proteins of interest leading to structure-based drug design in the decades to come. The discovery of recombinant DNA and hybridoma technologies in the 1970s made possible the first recombinant protein therapeutics such as Humulin [2] from Eli Lilly/Genentech, Epogen from Amgen [3] and Intron-A from Biogen; and the first monoclonal antibody (mAb) therapeutic, Orthoclone OKT3 [4, 5]. These technologies kick-started a new drug modality—biologics—with ~160 protein therapeutics approved for clinical use to date [6] (Fig. 1). Biologics like (but not exclusive to) immunoglobulins such as monoclonal antibodies (mAbs) are currently the drug modality most pursued in therapeutic development [7]. However, limitations around single targeting using mAbs and rising unmet medical needs inspired researchers to go further and conceptualize molecules with multispecificity [8, 9]. A defining feature of multispecific antibodies (MsAbs) is the ability to recognize two or more epitopes located on the same or distinct targets. This multiple-recognition capability expands the functionality of conventional mAbs, allowing for diverse applications, such as recruiting immune cells to destroy tumor cells or crosslinking distinct cell surface proteins [8–10].

Chronological representation of milestones in technology and drug discovery. From left to right: chemical synthesis of small molecules (aspirin pictured); X-ray crystallography; recombinant DNA technology; hybridoma technology; first interferon as a biologic; first approved monoclonal antibody therapy; computational protein design; cloud computing; machine learning; first approved bispecific; and a future of possibilities.
Although the first MsAbs like catumaxomab, blinatumomab and emicizumab started as a simple composite of Ab fragments extracted from Abs with distinct specificities/epitopes [11–14], the fusion of non-Ab proteins (e.g. cytokines and even endogenous protein ligands (41BB)) to Ab domains was observed to yield arrays of complex protein chimeras with great therapeutic promise but unpredictable behavior [15–17]. MsAbs often vary in size, configuration, valencies, flexibility and angle of approach of their binding modules, as well as in their developability, distribution and pharmacokinetic attributes [8, 9]. These new molecular entities (NME) quickly exposed the limitations of drug development and its platforms, with only seven MsAbs approved thus far despite ~300 molecules reported to be in clinical development [6]. Although much of this failure can be attributed to suboptimal target identification and/or validation, the poor drug-like properties of the clinical candidates also need to be addressed [18]. An emergent generation of computational protein design tools integrated with more rationally based workflows may prove key in the design of drug development workflows that we can control and whose success we can predict [19–21].
Many cutting-edge technologies from the tech sector, such as machine learning (ML) and artificial intelligence (AI), are spilling over into biotech, with several large biopharmaceutical companies making notable investments in this space (Rathore, cell, 2022). For example, Generate Biomedicines and Amgen are partnering to use ML to overcome the challenges around the so-called undruggable targets. Over the past 2 years, the potential for ML to transform drug discovery has been demonstrated with the use of tools like AlphaFold2 and RoseTTAFold in predicting protein structures [22–27]. Further progress will depend on the availability and collection of large sets of data. In this review, we will examine the attributes of a select group of protein building blocks (BBs) and how they can be exploited to engineer MsAbs with predictable outcomes. We will then examine the state of the art in structure- and computation-guided protein engineering, and how to successfully deploy such tools to enable the design of the next generation of MsAbs.
BUILDING BLOCKS
MsAbs provide important advantages over traditional antibodies, including their ability to be engineered into powerful molecules targeting multiple proteins with robust biochemical properties and novel functionalities. Nature’s existing proteins capable of binding, like Abs and cytokines, provide valuable BBs (Fig. 2), whose sequences and folding were optimized over millions of years [28–30]. Similarly, for the development of nature-like MsAbs, the choice of suitable BBs and their compatibility when interconnected, either by genetic fusion via linkers or covalent bonds in the case of heterodimerization of protein domains (i.e. chain pairing), are key considerations when planning the design and generation of these highly functional, multivalent molecules [31–35]. When combined correctly, BBs can generate a plethora of MsAb formats. Nature-made BBs include fragment antigen binding (Fab), single-chain fragment variable (scFv), fragment crystallizable (Fc), single domain antibodies (VHH) and cytokines (Fig. 3) [31, 36–41]. In contrast, man-made BBs, like miniproteins and de novo proteins, are also possible with the next generation of computational tools like Rosetta [35] (Fig. 2). Below, we take a careful look at the properties of each BB and consider the advantages and disadvantages that each class of BBs brings to the assembly of MsAbs in the context of the therapeutic product profile (TPP).

Building blocks as part of the multispecific Abs in the clinic. Using Cortellis (https://www.cortellis.com/intelligence/home.do), we have identified ~300 MsAbs (molecules that recognize two or more epitopes located in the same protein target or not) that were in phase I to III clinical trials as of December 2022. Because in this review we focus on those MsAbs whose quaternary structure is assembled via genetic fusion or covalent bonds, we have discarded all antibody–drug conjugates. Moreover, for 64 out of the 300 entries, the quaternary structure or the BB composition was not disclosed, leaving a total of 236 MsAbs as the final curated sample from which we proceeded to identify and quantify each BB that composes each MsAb. For simplicity, we do not account for a BB’s valency against the same or distinct epitopes. Therefore, a Hetero-IgG molecule will count as just one Fab for this class of BB. Similarly, in the case of an IgG-scFv molecule, despite displaying two copies of the same Fab, we also count one Fab. After careful analysis of the information publicly available, we have determined that of the 236 molecules, 197 (83%) contain the Fc region, 137 (58%) contain Fab, 85 (36%) contain scFv, 34 (14%) contain cytokines, 25 (10%) contain VHHs, 4 (2%) contain miniproteins, 25 contain “others” and none contain de novo proteins as BBs. In “others”, we have decided to capture nature-made protein domains like 41BBL and protein toxins.
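The shares quoted in this legend can be recomputed directly from the counts. The short Python sketch below does only that arithmetic; the dictionary keys and the `share` helper are illustrative names, and the counts are copied from the legend (curated sample of 236 MsAbs).

```python
# Counts per building block (BB), copied from the figure legend.
# Denominator: the curated sample of 236 MsAbs with disclosed composition.
TOTAL = 236
counts = {
    "Fc": 197,
    "Fab": 137,
    "scFv": 85,
    "cytokine": 34,
    "VHH": 25,
    "miniprotein": 4,
    "others": 25,
    "de novo": 0,
}


def share(count, total=TOTAL):
    """Percentage of the curated sample that contains a given BB."""
    return 100.0 * count / total


shares = {bb: share(n) for bb, n in counts.items()}

for bb, pct in shares.items():
    print(f"{bb:12s} {counts[bb]:4d}  {pct:5.1f}%")
```

Because one MsAb usually carries several BB classes (for example an Fc plus a Fab plus an scFv), the shares are not expected to sum to 100%.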

Schematic representation of selected BBs. Protein BBs can be genetically fused to each other rendering a vast array of new molecule entities with functions not seen in nature-made biologics.
Fab, the most used building block
A simple and minimalistic approach to designing MsAbs is to use the Fab moiety of the IgG. Fabs are composed of the VH and CH1 domains of the HC and the whole LC (VL and CL) [42, 43], which create a very stable heterodimeric interface wherein approximately 100 residues are involved in multiple stabilizing contacts including hydrogen bonds, salt bridges, hydrophobic interactions and van der Waals forces [43]. Moreover, and in great contrast with the scFv (see below), the Fab surface is ready for solvent exposure because its extraction from IgGs only requires a truncation at the flexible linker connecting CH1 to the Ab hinge. Thus, the stable protein core and the vast hydrophilic surface make Fabs a popular choice for assembly of MsAbs, especially for drugs formulated at a high concentration and/or administered subcutaneously (SC). However, BBs composed of two distinct polypeptide chains require a heterodimeric assembly, which is an important consideration for MsAbs containing two or more distinct Fabs (e.g. Hetero-IgG) since correct cognate pairing of each HC and LC is necessary (Fig. 2). Since the single-cell expression of molecules like Hetero-IgGs can yield up to 10 different species, only one of which is correctly paired [32], many engineering strategies have been envisioned to enforce correct chain pairing (charge pairing mutations (CPMs), knob-into-hole (KiH), single-chain Fabs (scFabs), among others) [43–46]. scFabs retain the biophysical properties of Fabs, unlike scFvs, and have demonstrated impressive results recently [47]. Nonetheless, the sequence variation in the VH/VL interface caused by the CDRs makes these chain-pairing platform approaches less successful [32, 48]. Another method involves using common LCs (cLCs). These cLCs can pair with two or more distinct HCs with little to no disruption of binding, driven by the non-cognate HC, and can be discovered in vivo by using a cLC transgenic mouse [49], by display [50] or by simply testing whether a given LC, in addition to its cognate HC, can also pair with the non-cognate HC for expression and binding [32].
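The figure of ten species for a Hetero-IgG can be reproduced by simple enumeration. The sketch below assumes two distinct heavy chains and two distinct light chains that pair at random, and it treats the two half-antibodies of a molecule as interchangeable; the chain labels (`HA`, `HB`, `LA`, `LB`) are placeholders, not names from the source. Under these assumptions the enumeration gives ten distinct assemblies in total, one of which is the desired, correctly paired molecule.

```python
from itertools import product

HEAVY = ("HA", "HB")  # two distinct heavy chains (placeholder labels)
LIGHT = ("LA", "LB")  # two distinct light chains (placeholder labels)


def enumerate_species():
    """Enumerate distinct IgG-like assemblies built from two HC/LC arms.

    An arm is one heavy chain paired with one light chain. A molecule is an
    unordered pair of arms, because swapping its two halves gives the same
    molecule.
    """
    arms = list(product(HEAVY, LIGHT))  # 4 possible arms
    species = set()
    for arm_1, arm_2 in product(arms, repeat=2):
        species.add(tuple(sorted((arm_1, arm_2))))
    return sorted(species)


species = enumerate_species()

# The desired Hetero-IgG: each heavy chain with its cognate light chain.
desired = (("HA", "LA"), ("HB", "LB"))
mispaired = [s for s in species if s != desired]

print(f"{len(species)} species in total, {len(mispaired)} of them mispaired")
```

Chain-pairing technologies such as KiH, CPMs or a common LC can be read as ways of removing entries from this list before purification.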
According to Cortellis, ~58% of the ~300 MsAbs currently in clinical trials contain at least one Fab as BB (Fig. 3). Moreover, one of the first MsAbs receiving approval from the Food and Drug Administration (FDA) and the European Medicines Agency was emicizumab for the treatment of hemophilia, which contains two Fabs that share a common LC.
ScFvs, the most tempting building block with hidden liabilities
ScFvs are molecules generated by connecting the variable regions of heavy (VH) and light (VL) chains via an often-flexible linker [51]. Several Fv linkers have been successfully brought to the clinic. The most common is (Gly4Ser)3 [14], but the linker sequence can be customized to be more rigid or even electrostatically charged [52]. In addition, the VH–VL or VL–VH orientation in the scFv molecule affects the binding and the developability properties of this BB [53]. The scFv is an attractive BB for making MsAbs since a single polypeptide chain avoids the complexities of chain pairing (see Fabs above) and is approximately half the size of a Fab (~25 kDa) (Fig. 2). Moreover, scFvs can be fused in tandem using a variety of linkers to the N and C terminus of the Fc and/or linked in a string varying from two to four copies [8, 54]. However, since this BB lacks the CH1 and CL, the constant region of the Fab, the native disulfide bond at the C terminus of the CH1–CL interface is also absent. Although the protein linker can help overcome the inherent instability of Fv during purification and storage, the thermostability is lower on average than that of the parent Fab due to the dynamic VH/VL interface [51, 53, 55]. To remediate this, disulfide bonds have been successfully deployed to strengthen this interface and increase the thermostability. However, when platform disulfide solutions are deployed (e.g. VH40–VL100), many Fabs fail to retain function upon conversion to scFv [53, 56, 57]. Tailored disulfide bonds that improve stability can be identified by computational protein design [31, 58]. The generation of scFvs leads to the truncation of the CH1/CL region, which exposes ~25% of the Fv surface to the solvent (Fig. 2). Consequently, this region on the opposite side to the paratope of the Fv becomes a critical area with a propensity to aggregate [53]. Because aggregation is often correlated with molecular concentration, MsAbs containing scFvs can display poor monodispersity when subjected to Ab-like concentrations, therefore limiting the TPP. Generally, the appeal of a simple platform technology coupled with liabilities that are only evident later in the development cycle has resulted in many scFv-containing MsAbs advancing to the clinic. Consequently, ~36% of the ~300 MsAbs currently in clinical development contain at least one scFv as one of the BBs (Fig. 3), with blinatumomab, a tandem of two scFvs from Amgen, receiving FDA approval in 2014. To overcome scFv liabilities as well as chain-pairing complexities, the field has been moving toward VHHs.
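At the sequence level, an scFv is a concatenation of the two variable domains and a linker, in one of two orientations. The sketch below illustrates this with the (Gly4Ser)3 linker named in the text; the function name `build_scfv` and the short `vh`/`vl` strings are hypothetical placeholders, not real variable-domain sequences.

```python
# (Gly4Ser)3 linker in one-letter code: three repeats of GGGGS, 15 residues.
G4S_X3 = "GGGGS" * 3


def build_scfv(vh, vl, linker=G4S_X3, orientation="VH-VL"):
    """Concatenate variable domains and a linker into one scFv sequence.

    orientation is "VH-VL" (VH at the N terminus) or "VL-VH".
    """
    if orientation == "VH-VL":
        return vh + linker + vl
    if orientation == "VL-VH":
        return vl + linker + vh
    raise ValueError("orientation must be 'VH-VL' or 'VL-VH'")


# Placeholder fragments for illustration only; NOT real variable domains.
vh = "EVQLVESGG"
vl = "DIQMTQSPS"

scfv_hl = build_scfv(vh, vl)
scfv_lh = build_scfv(vh, vl, orientation="VL-VH")

print(scfv_hl)
print(scfv_lh)
```

Both orientations have the same length and composition, which is why their differing binding and developability behavior has to be tested experimentally rather than inferred from the sequence alone.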
VHH: A newly approved building block that promises to replace scFvs
In addition to IgGs, camelids and sharks naturally produce heavy-chain-only antibodies where the antigen-binding region consists of a single variable domain of the heavy chain (VHH) [59] (Fig. 2). Moreover, VHHs can also be generated in transgenic mouse models that express humanized VHH [60, 61] and by display [62]. These VHHs, also called nanobodies or single domain antibodies (sdAbs), offer great advantages as BBs in the assembly of MsAbs, bypassing the need for cognate HC/LC pairing as in the case of Fabs and avoiding the deployment of interconnecting linkers as in the case of scFvs [59, 63]. Moreover, due to their compact structure of ~12–15 kDa (Fig. 2), VHHs are likely to display desirable developability attributes including high yield, low aggregation upon stress (2 weeks at 4 °C) at high protein concentration, high thermostability and low viscosity [64, 65]. As a result of their optimal biophysical properties, VHHs can also be safely delivered by inhalation since their small size ensures a short half-life in the systemic circulation [66]. The binding affinity of VHHs can be as high as picomolar, matching that of the best Fab or scFv molecules even without the VL paratope [63]. Although there were initial safety concerns regarding the non-human framework of the VHHs, these were quickly mitigated with the FDA approval of caplacizumab in 2019. Since then, 25 VHH-containing MsAbs (~10%) have entered clinical development (Fig. 3), suggesting that the path for VHHs in the clinic is now clear.
Fc, the multifunctional building block
The Fc in antibody engineering
The role of Fc as a key BB for biologics has long been recognized, with the first Fc-CD4 fusion developed over three decades ago to inhibit HIV infection [67, 68]. In addition, MsAbs greatly benefit from the ability of the Fc to heterodimerize upon deployment of chain-pairing technologies (Fig. 2). Two popular approaches for Fc heterodimerization are the use of knob-into-hole (KiH) mutations and CPMs [46, 69]. KiH mutations introduce a sterically bulky amino acid, like tryptophan or tyrosine, on one chain and small amino acids, like glycine or alanine, on the other to create a protrusion and a pocket, respectively. While this approach is particularly efficient at reducing knob-knob homodimers, the hole-hole homodimer can still form and must be separated from the desired heterodimer through additional purification steps. In contrast, CPMs introduce positively charged residues on one chain and negatively charged residues on the other. Here, chains with the same charge are expected to repel, whereas opposite charges favor heterodimer formation. This makes the Fc an engine for the generation of molecules where asymmetric formats are required, since distinct BBs can be attached to both the N and C termini of each half of the Fc [8, 9, 70] (Fig. 2). This BB is also known for displaying an optimal developability profile, which provides Fc-containing MsAbs with higher expression yields, ease of purification via Protein A capture and high solubility [71]. Moreover, the Fc can recognize several receptors on the surface of a variety of cells, which can be manipulated to improve the therapeutic index (TI) (Fig. 2).
Not surprisingly, around 83% of the ~300 MsAbs currently in clinical trials contain an Fc (Fig. 3).
Fc biology
The Fc interacts with multiple effector proteins that alter immune response or prolong antibody half-life in vivo [72]. For example, the IgG Fc domain can interact with proteins from the Fc gamma receptor family (FcγR) to elicit effector cell responses that result in antibody-dependent cell-mediated cytotoxicity or antibody-dependent cell-mediated phagocytosis [73]. The neonatal Fc receptor (FcRn) is, among other functions, responsible for the prolonged half-life of Fc-containing MsAbs as it is a crucial part of an endosomal recycling process [74]. C1q also binds to the Fc and is responsible for activating the complement system culminating in complement-dependent cytotoxicity via formation of the membrane attack complex [75]. Thus, each interaction in Fc-containing therapeutics must be carefully modulated to prevent unnecessary immune response.
FcγR
The FcγR family modulates cytotoxicity and phagocytosis through intracellular signaling events in response to antibody binding. These proteins consist of FcγRI, FcγRIIa, FcγRIIb, FcγRIIc, FcγRIIIa and FcγRIIIb [76, 77]. Each FcγR is classified as either activating or inhibiting, based on the intracellular signaling motif. FcγRI, FcγRIIa, FcγRIIc and FcγRIIIa are considered activating in humans and contain the immunoreceptor tyrosine-based activation motif (ITAM) in their intracellular domains [78–80]. The receptors, except for FcγRI, have low affinity for the Fc and interact best with IgG complexes. The resulting clustering of the ITAM motifs leads to a signal cascade involving SRC kinase, SYK kinases, Rho GTPases and actin polymerases, resulting in phagocytosis [80–86]. MAP kinases, the RAS pathway and MEK are also activated through this signaling cascade, leading to the expression of cytokines [87–89]. The result of the signaling cascade is cell-type specific and dependent on the types of receptors expressed on each cell. In contrast, FcγRIIb is the only inhibitory family member and contains an immunoreceptor tyrosine-based inhibitory motif (ITIM) [90–92]. Interactions between ITIMs and ITAMs result in the recruitment and activation of SHIP-1 and SHIP-2, leading to depletion of IP3 precursors and preventing downstream signaling [93]. Thus, the Fc interaction with the FcγRs presents us with a unique opportunity to program Fc-containing MsAbs to recruit and elicit immune cell responses in a localized fashion that can be explored in oncology, inflammation and virology. The P238D mutation in the Fc enhances FcγRIIb binding over FcγRIIa and reportedly increases agonistic activity [94]. In contrast, mutations like LALA developed at Genentech or the SEFL2 from Amgen were designed to completely abrogate Fc binding to all FcγRs [31, 42, 95, 96]. In both cases, LALA and SEFL2 enable therapeutic effects to come solely from the Fab or any other targeting BB of an MsAb.
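The classification in this paragraph can be kept as a small lookup table when annotating Fc variants. The Python sketch below encodes only what the paragraph states; the dictionary layout and the `receptors_with` helper are illustrative, and FcγRIIIb is left unclassified because the text does not assign it a signaling motif.

```python
# Human FcγR family as described in the text.
# signal/motif are None where the text gives no classification.
FCGR = {
    "FcγRI":    {"signal": "activating", "motif": "ITAM"},
    "FcγRIIa":  {"signal": "activating", "motif": "ITAM"},
    "FcγRIIb":  {"signal": "inhibitory", "motif": "ITIM"},
    "FcγRIIc":  {"signal": "activating", "motif": "ITAM"},
    "FcγRIIIa": {"signal": "activating", "motif": "ITAM"},
    "FcγRIIIb": {"signal": None, "motif": None},  # not classified in the text
}


def receptors_with(signal):
    """Return receptor names whose signaling class matches `signal`."""
    return [name for name, info in FCGR.items() if info["signal"] == signal]


activating = receptors_with("activating")
inhibitory = receptors_with("inhibitory")

print("activating:", ", ".join(activating))
print("inhibitory:", ", ".join(inhibitory))
```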
FcRn
FcRn is a unique Fc receptor because it is not directly involved in signaling cascades that lead to immune responses. It interacts with the Fc as a heterodimer with beta-2-microglobulin (β2m) and is structurally similar to MHC class I [97, 98]. FcRn is found intracellularly and functions to prolong IgG half-life by recycling through endosomes [99]. Under weakly acidic conditions, FcRn displays high affinity to Fc through the protonation of residues His310 and His435 on the Fc, located at the junction of CH2 and CH3 [100]. Once IgG enters the endosomes, the acidic conditions result in histidine protonation and subsequent FcRn binding [101]. The FcRn–IgG complex is then retained in the endosome while other molecules are transported to the lysosomes for degradation. Once the endosome fuses with the plasma membrane, the pH change causes the dissociation of the FcRn–IgG complex. This mechanism of action is also responsible for the translocation of IgG into multiple tissues, including the placenta.
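The pH switch described here can be made concrete with the Henderson–Hasselbalch relation, which gives the protonated fraction of a titratable group as 1 / (1 + 10^(pH − pKa)). The sketch below is a rough illustration only: the pKa of 6.0 is a generic textbook value for a histidine side chain, and the pH values of 6.0 (endosome) and 7.4 (cell surface) are assumed for illustration. None of these numbers comes from the source, and the actual pKa of His310 and His435 in the folded Fc may differ.

```python
def fraction_protonated(ph, pka=6.0):
    """Henderson-Hasselbalch: fraction of a titratable group that is protonated.

    pka=6.0 is an assumed, generic value for a histidine side chain.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


ENDOSOME_PH = 6.0  # assumed, weakly acidic
SURFACE_PH = 7.4   # assumed, near-neutral extracellular pH

in_endosome = fraction_protonated(ENDOSOME_PH)
at_surface = fraction_protonated(SURFACE_PH)

print(f"protonated at pH {ENDOSOME_PH}: {in_endosome:.2f}")
print(f"protonated at pH {SURFACE_PH}: {at_surface:.2f}")
```

Under these assumptions about half of the histidines are protonated in the endosome and only a few percent at the cell surface, which is consistent with binding in the endosome and release after fusion with the plasma membrane.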
Often there are benefits to further extending the half-life provided by the Fc, via half-life extension (HLE) mutations. Thus, mutations that increase the Fc affinity to FcRn at low pH can be of clinical relevance because HLE is preferred to reduce dose frequency. Two examples of HLE technologies are the LS (M428L/N434S) mutations from Xencor and YTE (M252Y/S254T/T256E) from Medimmune, which were identified based on point mutation scans of the FcRn–Fc interface [102–104].
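Mutation sets such as LS and YTE use the compact notation wild-type residue, position, substituted residue, joined by slashes. The sketch below parses that notation and applies it to a position-to-residue map; the function names are illustrative, and the `wild_type` map contains only the five positions named in the text, with the residues read off the mutation names themselves rather than from a full Fc sequence.

```python
import re

# One-letter wild type, numeric position, one-letter substitution (e.g. M428L).
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(spec):
    """Split a string like 'M428L/N434S' into (wild_type, position, mutant)."""
    parsed = []
    for token in spec.split("/"):
        match = MUTATION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized mutation: {token!r}")
        wt, pos, mut = match.groups()
        parsed.append((wt, int(pos), mut))
    return parsed


def apply_mutations(residues, spec):
    """Apply mutations to a {position: residue} map, checking the wild type."""
    out = dict(residues)
    for wt, pos, mut in parse_mutations(spec):
        if out.get(pos) != wt:
            raise ValueError(f"expected {wt} at {pos}, found {out.get(pos)}")
        out[pos] = mut
    return out


LS = "M428L/N434S"
YTE = "M252Y/S254T/T256E"

# Only the positions named in the text; not a full Fc sequence.
wild_type = {252: "M", 254: "S", 256: "T", 428: "M", 434: "N"}

ls_variant = apply_mutations(wild_type, LS)
yte_variant = apply_mutations(wild_type, YTE)

print(parse_mutations(LS))
print(ls_variant)
print(yte_variant)
```

Checking the wild-type residue before substituting catches numbering mismatches, which are a common source of error when mutations are transferred between constructs.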
C1q
C1q consists of six trimeric globular regions connected by alpha helices to a central bundle [105]. Each of these six globular regions interacts with the CH2-hinge region in the Fc of IgG and IgM antibodies [105–107]. Thus, complement activation depends on the association of C1q with a multimeric IgM or IgG, forming a pentameric or hexameric structure, respectively [108, 109]. These interactions trigger the activation of the complement system, resulting in the formation of the membrane attack complex, leading to complement-dependent cytotoxicity.
Cytokines, the off-the-shelf building blocks that were not made to last
The immune system uses a complex array of cell types carefully controlled by a class of secreted extracellular signaling proteins called cytokines, typically less than 30 kDa, each of which stimulates a different set of cells (Fig. 2) [110]. The diversity in function for this class of molecules represents an opportunity to reengineer the native proteins as effective BBs for the development of cytokine-containing MsAbs, which can be deployed to equilibrate out-of-balance inflammatory states for both cancer immunotherapy and many autoinflammatory diseases [39]. For example, because IL-2, a 15-kDa four-helical bundle, is a potent proinflammatory cytokine that stimulates interferon gamma production in effector T cells (Teff) and in natural killer cells, it is being explored to treat cancer patients [111], whereas IL-4, homologous in structure to IL-2, suppresses the immune system by inducing differentiation of naïve helper T cells, with the potential to be applied as an anti-inflammatory [112]. With over 100 different cytokines identified to date, each with a unique immunological niche, they represent, at first glance, a library of off-the-shelf BBs ready for therapeutic use [113, 114]. Not surprisingly, around 14% of the ~300 MsAbs currently in clinical trials carry a cytokine (Fig. 3). However, the first cytokines evaluated in the clinic were wildtype (WT) (e.g. IL2 and IL12), which demonstrated narrow therapeutic windows, mostly due to their high toxicity profiles [115, 116]. This suggests that to deliver the full potential of these cytokines, the toxicity profiles must be improved. This can be attempted by engineering the BB itself, by smart combination with other BBs, or both [114].
Developability of the cytokines
Many cytokines, when recombinantly expressed, exhibit liabilities such as aggregation, which in turn can affect engineering efforts, developability and potential immunogenicity [117, 118]. This may be a consequence of most cytokines being composed of helical bundle motifs and displaying multiple binding interfaces, which makes the native form of cytokines rich in hydrophobic patches that can lead to aggregation [119]. Another example of a liability is the case of free cysteines. The IL-1 family, a group of 11 cytokines with a beta-trefoil fold (e.g. IL-1, IL-18, IL-36), displays multiple free cysteines, making them prone to aggregation via non-canonical disulfide bonds and/or to reduced potency when exposed to oxidative environments [120, 121]. Thus, novel C-to-S substitutions have been introduced to mitigate these issues [122]. For example, most IL-2 molecules in the clinic contain a C125S mutation that is critical to avoid aggregation during development [123–125]. In other cases, these cysteines are involved in forming intermolecular disulfide bonds to ensure the assembly and stability of functional dimers (e.g. the p35 and p40 subunits of the IL12 heterodimer) [126]. With many of these issues impairing the developability of cytokines, the question arises as to whether in some cases the endogenous cytokine should be replaced altogether with more developable BBs capable of mimicking the native signaling.
For example, Synthekine has developed a unique surrogate cytokine agonist platform that uses a combinatorial approach to mimic cytokine–receptor interfaces via linked VHHs among other BBs [127]. These new configurations of molecules with VHHs tethered by flexible linkers allow for receptor activation as well as novel signaling outputs not observed with the native cytokines.
Modulating cytokine cell-type selectivity
Cytokine receptors are expressed at varying levels on different cell types, providing an opportunity to tune cytokine receptor affinity to bias signaling toward cell types that, when stimulated, provide a favorable outcome for a disease of interest. Cytokines with several mutations relative to the WT sequence, called “muteins”, have become a powerful tool where each binding interface is carefully manipulated to tune affinity via the introduction of point mutations [128]. For example, in the case of IL2, attenuating binding to the subunit IL2Rα reduced activation of immunosuppressive Treg cells, leading to higher T-effector activation and, thus, widening the TI by requiring lower doses [129]. While much of this effort has been achieved by structural analysis of the cytokine/receptor interface to inform point mutations, others have deployed directed evolution by yeast display [130]. An example of the latter is the work by Garcia and colleagues, in which a series of directed evolution steps by yeast display produced an enhanced version of IL2 known as “superkine”. This molecule, like many other IL2 muteins, shows reduced binding to IL2Rα, but it also binds IL2Rβ with higher affinity than WT [131].
Neoleukin Therapeutics has pioneered a novel strategy focused on developing better than nature-made cytokines by designing de novo cytokines with customized function and enhanced developability profile. In this case, Neo-2/15, an IL2 cytokine mimetic, was designed using Rosetta to graft a de novo helix onto the native IL2 backbone, completely abrogating binding to IL2Rα while building additional stability onto this helical bundle protein [132]. This case study demonstrates the potential for applying computational tools to the design of de novo cytokines that improve upon nature.
Targeted cytokines
Toxicity from systemic cytokine signaling has motivated the coupling of these BBs with other BBs (targeting arms) with the purpose of reducing peripheral cytokine activity. These BBs fused with cytokines attempt to localize the cytokine to a tissue of interest (e.g. the tumor microenvironment) and reduce peripheral activity by recognizing a selected tumor-associated antigen (TAA) [133]. However, TAA selection can be challenging due to patient diversity, antigen copy-number, expression levels in different tissues and internalization rates [134]. Amgen’s tumor-targeted IL-21 muteins demonstrated success when fused to Abs that target PD-1, which is highly expressed on activated T cells known to populate the tumor microenvironment [15]. Roche has demonstrated several successful examples of IL2 muteins fused to targeting arms, including those against PD-1, the tumor antigen carcinoembryonic antigen (CEA) and fibroblast activation protein-α, a protein involved in extracellular matrix remodeling and highly expressed in tumor microenvironments [16, 135, 136]. However, for the targeting approach to produce its effect, the targeting arm must drive the sequence of binding events. Thus, it is key to ensure that the binding affinity of the BB(s) that recognize the TAA(s) is significantly higher than that of the cytokine BB. This often leads to the use of cytokine muteins in combination with the targeting arm technology, as muteins have, by definition, attenuated binding affinity. Nonetheless, the selection of the best mutein BB and target BB combination needs to be holistically considered for the generation of clinical candidates that meet TPP requirements.
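Why the targeting arm needs the higher affinity can be illustrated with the simplest equilibrium binding model, in which the fraction of receptor occupied at ligand concentration C is C / (C + KD). The sketch below is a deliberately crude, monovalent picture that ignores avidity, receptor density and the effect of the two arms being physically linked; the KD values and the concentration are hypothetical numbers chosen only for illustration.

```python
def fraction_bound(conc_nm, kd_nm):
    """Equilibrium occupancy for simple 1:1 binding: C / (C + KD)."""
    if conc_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return conc_nm / (conc_nm + kd_nm)


# Hypothetical values, not taken from the source.
KD_TARGETING_ARM_NM = 0.1  # high-affinity arm against the TAA
KD_MUTEIN_NM = 100.0       # attenuated cytokine mutein
DRUG_CONC_NM = 1.0         # free drug concentration

arm_occupancy = fraction_bound(DRUG_CONC_NM, KD_TARGETING_ARM_NM)
mutein_occupancy = fraction_bound(DRUG_CONC_NM, KD_MUTEIN_NM)

print(f"targeting arm occupancy: {arm_occupancy:.1%}")
print(f"cytokine mutein occupancy: {mutein_occupancy:.1%}")
```

With these numbers the targeting arm is roughly 90% occupied while the unanchored mutein engages about 1% of its receptor, so cytokine signaling is expected mainly where the molecule has first been captured by the TAA.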
Conditional activation of cytokines
Conditional activation describes the protein design strategy that attempts to reduce systemic activity by engineering in a dependency on a protein biosensor responding to a tissue-specific cue (e.g. protease cleavage, target reconstitution, pH change, small-molecule-induced allostery). For example, “masking” the activity of a proinflammatory cytokine by fusing a peptide or even its receptor to this BB has been shown to reduce cytokine signaling altogether. Once these molecules reach the target tissue, the mask is released, often by cleavage of a linker that encodes a cleavage site for a tumor-specific protease, and the cytokine activity is fully restored [137]. A case study is the IL2 mutein fused to a heterodimeric Fc also carrying IL2RB fused on the adjacent N terminus in the Fc via a matrix metalloproteinase (MMP) cleavable linker [138]. Interestingly, their initial approach using Fc homodimers with the same IL2 and IL2RB fused to the N terminus of each Fc chain exhibited no anti-tumor efficacy and could not be digested by MMPs, highlighting the complexity of the quaternary structures generated in such cases.
Targeted reconstitution is another form of conditional activation where, rather than activation by a tissue-specific cue, a cytokine is split into separately dosed subunits that only form an active molecule when brought together on the target tissue. This was first demonstrated by Neri and colleagues with a split IL12 therapy where p35 and p40 heterodimeric subunits were separately fused to an anti-tumor necrosis factor scFv, with the aim to only assemble the separated subunits when the antibody fusions were bound to cells expressing this specific TAA. The split IL12 showed dependency on the presence of both splits for activity, though the overall potency was lower relative to the endogenous p35–p40 complex [139]. More recently, a de novo IL2 mimic monomer, Neo-2/15, was split by separating its monomeric 4-helix bundle into 1-helix and 3-helix heterodimers simply by deleting the loop at a helix-loop-helix junction [140]. By fusing each split to either an anti-HER2 or anti-EGFR DARPin, it was demonstrated that the activity for each split was dependent on its other half and improved tumor shrinkage relative to both Neo-2/15 unfused and Neo-2/15 fused to a targeting BB.
Miniproteins
While Ab-based therapeutics have been the predominant biologics pursued, other classes of proteins are also being explored (Fig. 2). Historically, the use of non-Ig, or “alternative”, BBs is often motivated by the desire to overcome the limitations of nature-made BBs such as poor stability, poor tissue penetration, inability to target intracellular proteins or suboptimal drug delivery routes (e.g. intravenous), among others. In addition to offering solutions to these shortcomings, non-Ig BBs, as therapeutics in and of themselves, may offer different pharmacokinetic and pharmacodynamic profiles and can access target binding interfaces that nature-made BBs cannot. Aside from the allure of the potentially superior biological and biophysical properties of alternative BBs, novel scaffolds may also offer additional intellectual property protection for therapeutics. There are excellent reviews summarizing the properties and therapeutic relevance of these molecules [141–144], as well as successful targets to date [145]. In the context of MsAbs, non-Ig BBs offer a diverse set of binding interfaces with varying sizes, surface geometries and biophysical properties (Fig. 3). These differentiating factors have the potential to offer new binders toward hard-to-target molecules and offer new mechanisms of action. Crucially, incorporating miniprotein BBs into MsAbs can be a way of bypassing or simplifying the chain-pairing issue and of building additional targeting arms while mitigating the overall MW increase of the MsAbs.
Generally, these alternative scaffolds are single domains derived from a protein with desired properties. The protein of origin may be human, bacterial, invertebrate or even a de novo design, and the scaffolds range in size from 4-kDa knottins to 20-kDa anticalins (Fig. 2). Particular regions on each of these scaffolds are diversified in libraries, which are then screened via various display methodologies to discover binders that are subsequently affinity matured against a target of interest. A number of scaffolds are highlighted below to demonstrate their unique features.
Monobodies, pronectins and adnectins
The most similar to an Ig BB are monobodies, pronectins and adnectins. Each contains a β-sandwich fibronectin type III (FN3) domain composed of seven anti-parallel β-strands connected by flexible loops. Typically, three of these loops are diversified in vitro and used for binding in a manner analogous to antibody CDRs. Unlike the case for Fabs, libraries have also been made that diversify the solvent-exposed anti-parallel β-sheets, rather than the loops exclusively. In addition, flexible loops at opposite ends of the molecule have been engineered for binding. These features enable monobodies to serve as binders to diverse targets and may facilitate the design of biparatopics. Examples include BMS-986089, an adnectin–Fc fusion, which is currently in phase III clinical trials for spinal muscular atrophy, and SF2-S29, an MsAb that utilizes a monobody to achieve highly specific binding to FcγRII [146, 147].
Anticalins
Anticalins are based on the lipocalin protein family, which is involved in the extracellular transport of lipophilic cargo. They exhibit a prototypical β-barrel structure with an internal cavity and four highly flexible loops. The structural similarity of lipocalins to immunoglobulin variable regions inspired the engineering of the flexible loops, via display technologies, toward various therapeutic targets, the first being CTLA4 [148]. Due to the longer loops of the anticalin relative to an antibody’s CDRs, the total binding site surface area exhibited by the former (2380 Å²) was significantly higher than typical antibody–antigen binding surface areas (average 1550 Å²) or that of an anti-CTLA4 antibody (1664 Å²) binding to the same epitope (Fig. 3). This can potentially result in binders with higher affinity and/or specificity. A notable MsAb example utilizing anticalin binders is PRS-343, a 41BB/HER2-targeting bispecific currently in phase II clinical trials [149].
Affibodies
Affibodies are derivatives of staphylococcal protein A, and one of the few BBs with a bacterial origin. They are highly stable three-α-helical bundles, the helical surfaces of which are re-engineered for binding, again typically via display technologies. The molecular weight of an affibody is approximately 6.5 kDa, making it one of the smallest BBs. This reduced size may enable affibodies to target epitopes that are inaccessible to other, larger BBs. Indeed, a structurally characterized anti-HER2 affibody has been shown to bind a unique epitope with pM affinity [150].
In addition, modulating an MsAb’s molecular weight may be one avenue to control the molecule’s pharmacokinetic properties as well as its ability to penetrate solid tumors. An exemplary MsAb incorporating an affibody is an anti-EGFR affibody–trastuzumab fusion, termed an affimab, which demonstrated superior efficacy compared to trastuzumab [151]. As bacterial proteins, however, affibodies may prove more immunogenic than other BBs.
DARPin
Ankyrin repeat proteins are modular proteins composed of units of two anti-parallel alpha helices, with adjacent units connected by a beta hairpin. Each modular unit is composed of 33 residues and the full protein can be composed of up to 29 consecutive repeating units, although the most common length is four to six units. Designed ankyrin repeat proteins (DARPins) were generated using a consensus sequence design process to stabilize individual units and make them more amenable to random mutagenesis and unit shuffling, unlocking their potential as a binding scaffold. These proteins are related to leucine-rich repeat proteins, normally found in the Toll-like receptors of the innate immune system of vertebrates. In contrast to the binding interfaces of other BBs, the DARPin interface is highly rigid and the most concave in shape [152] (total binding surface area over 2300 Å² [153]), potentially making DARPins a more suitable binding scaffold against certain targets. The highly rigid, preorganized binding interface of DARPins likely contributes to the typically high-affinity binding observed [154]. The ability of this BB to act as an allosteric inhibitor, as well as its favorable Tm compared to mAbs, has also been attributed to its rigid structure [155–157]. An example of a unique multispecific composed solely of DARPins is MP0250, a single-chain trispecific molecule composed of four DARPin BBs. It is currently in phase II clinical trials against multiple indications.
Knottins
Knottins are short peptides, typically 30 to 50 amino acids in length, with a conserved fold highly stabilized by three disulfide bonds. This creates a rigid scaffold, with a number of flexible loops that have been diversified in libraries. Due to their small size, knottins are cleared from the body rapidly. Knottins are the smallest BBs used to generate binders, creating the potential to target otherwise inaccessible epitopes on proteins of interest [158], akin to affibodies. Given their small size, they may be useful in multivalent molecules where high avidity is necessary. To date, however, the vast majority of knottin binders have been used as diagnostic agents due to their short half-lives and have not been used in MsAbs [159]. A limitation of this BB is the heterogeneity seen in disulfide bond formation in some instances.
De novo miniproteins
De novo protein design uses computational tools guided by predictive algorithms based on biophysical properties and large data sets of predetermined protein structures to generate man-made protein folds or even entire proteins with amino acid sequences not found in nature (Fig. 2) [160, 161]. In this review, de novo miniproteins are defined as BBs where a binding motif or an entire protein is designed in silico and the BB is capable of binding to a specific target. One of the earliest examples of designing de novo miniproteins combines mimicry of a protein interface with a computationally designed scaffold. First, a functionally relevant binding motif (e.g. 2–3 key residues for a binding interaction) is identified by structural insight or experimentally; second, de novo protein scaffolds of a variety of sizes and topologies (e.g. helix-helix and helix-strand) are computationally screened and optimally grafted onto the binding motif [162]. This approach was successfully used to design neutralizing binders against the conserved stem region of the influenza A H1 haemagglutinin (HA) from the H1N1 pandemic virus [163, 164]. In a similar approach, de novo miniproteins were designed to mimic the ACE2 binding motif for the receptor binding domain (RBD) of SARS-CoV-2, leading to neutralization of this virus by capping the RBD binding sites on the virus and thereby inhibiting membrane fusion [165]. In contrast with the HA de novo binders, these RBD binders were generated by designing an ACE2-derived backbone harboring key RBD–ACE2 interacting residues, followed by introducing completely de novo buttressing helices beneath it to mask its exposed hydrophobic underside.
The design of completely computationally designed de novo miniproteins with binding interfaces that do not exist in nature has also been recently reported [166]. These purely man-made binding motifs were generated by utilizing rotamer interaction fields that represent comprehensive maps of side-chain rotamers on a targeted interface [166]. In this case, 84 690 scaffolds composed of helix-bundle and helix-strand topologies were docked to fit the billions of possible side-chain interactions per target, generating 14 binders with nanomolar affinity against a diverse set of targets including CD3δ, TGFβ, EGFR and IL-7Rα. Interestingly, a close look at the number of initial hits for each target interface showed that binders were enriched for target interfaces with greater hydrophobicity.
De novo miniproteins are a novel, yet fast-developing class of BBs with the potential to accelerate the discovery of functional binders by bypassing the lengthy timelines of in vivo and display approaches. However, de novo miniproteins should be viewed as a double-edged sword: a highly dynamic BB with potential for faster timelines, optimized developability, novel function or customized conditional activation [131, 139], but also a BB with an inherent risk of immunogenicity because it is composed of amino acid sequences with little to no human homology. One of the best-characterized de novo miniproteins in vivo, Neo-2/15, mimics human IL2. It demonstrated almost identical IgG production to hIL2 and mIL2 in mice over a 14-day study, which was attributed to the small size (~10 kDa) and high stability of the molecule [131]. However, since de novo miniproteins vary significantly in sequence, immunogenicity risks will have to be assessed case by case.
ASSEMBLY OF THERAPEUTIC FORMATS
Large organizations often require platforms with firmly established processes to conduct drug discovery programs. In the case of small molecules or of monoclonal Abs, screening libraries containing billions of compounds or thousands of Abs in search of a lead candidate is a good example of current workflows. However, the same approach may not prove equally successful when the intent is to develop MsAbs.
Although platforms such as BiTE, XmAb, DART and CrossMAb, among others, have reported initial success generating numerous clinical candidates and even market-approved molecules (e.g., blinatumomab), their limitations as universal platforms quickly become apparent [14, 52, 167, 168]. Initial MsAb formats were focused on very specific biological targets with very specific epitopes. However, the same format/configuration may not show the same performance when applied to different targets and/or epitopes [169], clearly highlighting the unique relationship between the MsAb’s format/configuration and the epitope recognized by the selected BBs. Single format-driven platforms tend to have a fixed valency like 1:1 or 2:2, which can be a limiting factor when mixed binding valency is required [170]. Thus, the pursuit of many highly valuable therapeutic targets requires a departure from the so-called symmetric formats [8]. For example, a crosslinking-dependent mechanism of action (MOA) with CD28 requires a monovalent binding approach to avoid activation in the absence of a TAA [171]. In contrast, targets like tumor necrosis factor receptor superfamily member 9 (TNFRSF9) (41BB) and tumor necrosis factor receptor superfamily member 5 (TNFRSF5) (CD40) do not require a monovalent approach to observe the same MOA [7, 172]. Further examples of format/configuration divergence are the trispecifics. In this case, the molecule is designed to recognize three targets (simultaneously or in an ordered fashion) with the aim of improving the TI [173–175]. In the T-cell engager field, the deployment of 2xTAAs linked to CD3 can lead to better target cell selection than a single TAA (the “and” effect), reducing toxicity while improving the overall TI [173]. In addition, the 2xTAAs trispecific approach can be engineered to expand the patient population when no single TAA is widely expressed (the “or” effect) [7].
Consequently, the assembly of three distinct binders into a single entity with each possible variable valency (1:1:1, 1:1:2, 1:2:1, 2:1:1, etc.) will create a scenario where hundreds of configurations are possible. Moreover, success in the assembly of trispecific molecules is even more dependent on the selection of appropriate BBs, because the complexities of molecular geometry, molecule distribution and half-life, and developability properties grow with the number of binding specificities. Altogether, the reasons above suggest that these exciting modalities need a new platform concept.
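To illustrate how quickly the number of configurations can reach the hundreds, the following is a minimal counting sketch. It rests on simplifying assumptions that are not taken from the cited work: each of the three binders is present at a valency of 1 or 2, and a "configuration" is a distinct linear ordering of the resulting modules along a hypothetical scaffold. Real formats have additional geometric constraints and degrees of freedom, so the number is only indicative. The function names are illustrative.

```python
from itertools import product
from math import factorial


def valency_tuples(n_binders, max_valency):
    """All valency assignments, e.g. (1, 1, 2) means binder C is bivalent."""
    return list(product(range(1, max_valency + 1), repeat=n_binders))


def orderings(valencies):
    """Distinct linear orderings of the modules (multinomial coefficient).

    Assumption: copies of the same binder are interchangeable.
    """
    total = factorial(sum(valencies))
    for v in valencies:
        total //= factorial(v)
    return total


def count_configurations(n_binders=3, max_valency=2):
    """Sum the orderings over every valency assignment."""
    return sum(orderings(v) for v in valency_tuples(n_binders, max_valency))


if __name__ == "__main__":
    for v in valency_tuples(3, 2):
        print(":".join(map(str, v)), orderings(v))
    print("total", count_configurations(3, 2))
```

Under these toy assumptions, three binders at valencies of 1 or 2 already give 222 orderings, before linker lengths, Fc placement or chain-pairing solutions are considered.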
Recently, an example of rational BB selection as a means to predict the success of MsAb assembly has been reported for the Hetero-IgG format [32]. Here, the authors identify the properties of the BBs, Fabs in this case, that impact the cognate pairing of the HC/LC and lead to the generation of many impurities when two unique HCs and two unique LCs are co-transfected into a single mammalian cell [32]. Although these polypeptide chains are equipped with ingenious chain-pairing technologies like KiH, CPMs, asymmetric disulfide bonds and domain switch, the results are often unsatisfactory and have low recovery yields [31, 45, 46, 69, 168]. However, if the Fabs are selected for their intrinsic preference for cognate pairing and low promiscuity, the expression of the desired molecule is highly favored. The molecular basis for such HC/LC pairing selectivity has been independently demonstrated by others [48]. Another challenge in the generation of MsAbs is the purification steps that follow expression. Often, additional engineering tools are deployed to facilitate such steps, like selectively modifying the protein’s affinity to resins to increase selectivity during Fc capture, or mutating charged, exposed residues to create a difference in the isoelectric point (pI) between the targeted species and the misassembled impurities [52, 176, 177]. Unfortunately, such approaches are largely unsuccessful as they are also dependent on the native properties of the BBs. Moreover, additional engineering done merely to facilitate purification must be applied judiciously to avoid exacerbating the risk of immunogenicity. Thus, the identification of the biophysical and biochemical attributes that characterize the BBs will directly contribute to the design of smart and format-agnostic platforms, which we control and whose success we can predict (Fig. 4).
Such platforms are more likely to be MsAb-centric, with the initial format considerations informing steps such as immunogen design, which will enrich the selection of BBs with the necessary requirements. Once BBs that meet the design goals are identified, efforts can be directed to test format diversity, wherein different structural arrangements and valencies will determine the best biology and other TPP attributes. Consequently, because the MsAb lead candidates were carefully selected to meet platform requirements for expression and purification profiles, the late optimization stage leading to the selection of the final clinical candidate is likely to become predictable (Fig. 4).

Figure 4. Generating multispecific antibodies. Immunogen design creates novel protein fusions that direct the development of antibodies to specific epitopes on the immunogen. These antibodies are generated either in vivo, in vitro or in silico and are characterized via biophysical analysis for properties amenable to the TPP. The multispecific molecule can be generated through incorporation of several different targeting modules fused to a central Fc. The resulting molecules are then expressed at a small scale. Each molecule undergoes functional and developability assessment to ensure the biological effect and TPP are met. Candidate molecules passing these metrics are then sent for CMC development and large-scale expression for clinical trials.
ATTRIBUTES TO BE DISPLAYED BY THE MSABS
The engineering of MsAbs is often solely focused on binding and cell function. However, for this NME to succeed as a therapeutic, many other attributes must be met as well. Indeed, cost of goods, molecule stability, half-life, safety and designated route of administration for patient comfort are equally critical parts of the target candidate profile. Therefore, how can we design and engineer biologics that address those attributes? A holistic approach is necessary. First, as mentioned above, the properties of the BBs selected will invariably translate into the final MsAb molecules. For example, Fabs with low expression levels, suboptimal molecular stability or poor purification profiles will often render MsAbs with low yield and low stability [32, 178]. Second, since we design molecules whose quaternary structures do not exist in nature, our ability to predict half-life and safety profiles, like anti-drug antibody responses, is often reduced. In such cases, several arrangements of the same BBs may have to be tested to identify the final MsAb format [179]. However, current limitations of existing cell-based assays and pre-clinical models impair our capacity to predict the MsAb’s profile after it is administered to patients.
ACHIEVEMENTS IN COMPUTATIONAL PROTEIN DESIGN AND THE ROAD TO ML-POWERED MSAB DESIGN
Since their inception [180–185], physics-based approaches, best exemplified by the Rosetta software suite, have achieved remarkable feats (Fig. 1). These include the design of the first de novo protein fold (Top7), enzyme [186], membrane protein [187], ion channel [188] and self-assembling nanocages [189–191]. Notable therapeutically relevant successes include the development of proof-of-concept vaccines against human immunodeficiency virus [191, 192], respiratory syncytial virus [193] and hemagglutinin of influenza [163]. In regard to antibody engineering, computational design has been successfully deployed to increase antibody affinity [194], create novel HC–HC and HC–LC chain-pairing technologies, and optimize developability [195, 196]. For example, von Kreudenstein and colleagues applied computational design and molecular dynamics force field energy calculations to drive the heavy chain (HC) pairing of MsAbs, with improved results over previously established experimental methods [46], while also requiring less empirical testing. Workflows have also been developed to model Ab–target complexes of unknown structure, using RosettaAntibody [197–199] and other tools such as ABodyBuilder [200]. Antibody structure prediction and analysis, however, have only become streamlined and democratized with user-friendly tools such as Molecular Operating Environment [201] and Schrödinger [202].
Recent machine learning (ML)-based protein structure prediction tools such as AlphaFold [24] and RoseTTAFold [203] are having a profound effect on protein design. Whereas homology modeling previously could take weeks and require deep familiarity with the protein of interest, these tools empower users to generate highly accurate models within minutes or hours [24, 204]. Newer tools can do so in mere seconds [205]. ML-based tools have also been created to optimize developability, as well as function [206–208]. Similarly, ML techniques have already been deployed to predict attributes of drugs, including antibody epitopes [209–211], stability [212] and immunogenicity [213], with some success. Excellent reviews on the application of different ML models, tools and design workflows to mAb engineering have been published recently [214, 215]. Ultimately, both physics-based and ML-based tools serve to minimize the empirical experimentation required and accelerate the drug discovery process to arrive at the final therapeutic molecule. Already, as outlined above, computational protein design can succeed with only a handful of experimentally validated designs.
However, MsAb design faces a number of challenges. It is important to note that the vast majority of computational design has been applied to BBs and not to MsAb assembly. Currently, numerous steps within the drug discovery process require empirical screening to discover and identify the optimal lead in terms of MsAb format, functional activity, developability, immunogenicity and pharmacokinetics. The assembly of MsAbs from BBs is a combinatorial problem, growing exponentially as more BBs and formats are considered. It is likely that only a very small fraction of all possible BB combinations in a limited number of MsAb formats will ever be tested experimentally. ML presents the opportunity to explore this astronomical space of possible formats and optimize MsAb properties at a holistic level. Nevertheless, the applicability of current ML models to MsAbs remains uncertain due to the lack of large datasets specifically focused on these molecules. Not surprisingly, the vast majority of data currently available are from mAbs or BBs. While many properties of individual BBs are likely pertinent to the MsAb format, others, particularly those related to the man-made quaternary structures of MsAbs as well as potential interactions between BBs, present unique challenges [216]. The interconnection of multiple BBs through flexible linkers, for example, may introduce new dynamics, steric hindrances and surface patches, impacting attributes such as expression levels, purification profiles, stability, half-life and immunogenicity [208, 213, 216, 217]. To address these challenges and develop reliable models for predicting MsAb properties, systematic data collection and curation of diverse MsAb formats with various BBs are crucial.
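The scale of this combinatorial problem can be made concrete with a back-of-the-envelope count. The sketch below is a simplification that assumes one BB is chosen independently for each target and that every format accepts every BB combination. The function names and the example numbers (20 candidate BBs per target, 50 formats, 1000 tested constructs) are hypothetical and not taken from any study.

```python
from math import prod


def design_space_size(bbs_per_target, n_formats):
    """Candidate MsAbs: one BB per target, placed in any of n_formats.

    bbs_per_target: number of available BBs for each target.
    """
    return n_formats * prod(bbs_per_target)


def fraction_tested(n_tested, bbs_per_target, n_formats):
    """Share of the design space covered by n_tested constructs."""
    return n_tested / design_space_size(bbs_per_target, n_formats)


if __name__ == "__main__":
    # The space grows geometrically with the number of specificities.
    for n_targets in (1, 2, 3):
        size = design_space_size([20] * n_targets, n_formats=50)
        print(n_targets, "target(s):", size, "candidates")
    print("fraction tested:", fraction_tested(1000, [20, 20, 20], 50))
```

With these illustrative numbers, a trispecific campaign would span 400 000 candidates, of which 1000 experimentally tested constructs would cover 0.25%.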
Several companies have already begun incorporating ML into their drug discovery workflows. For example, Generate Biomedicines (https://generatebiomedicines.com/) and A-Alpha Bio (https://www.aalphabio.com/) employ a combination of ML, protein design and large-scale experimental testing to discover new protein binders. Others, like Nabla Bio (https://www.nabla.bio/), use ML to predict and optimize the developability of mAbs. As more biopharmaceutical companies embrace ML, leveraging these technologies becomes increasingly important to remain competitive. Generating and curating the required experimental datasets is the first step in that process.
CONCLUSION
Here, we have provided a review of the status and challenges around the engineering of MsAbs, as well as insights to enable the next generation of MsAbs. Since the first MsAb was approved in 2009, over a decade ago, the industry has begun to accept that a single-format approach will largely fail to deliver on most unmet medical needs. Instead, the ever-increasing variability in MsAb formats serves as a valuable source of diversity that can be applied to the development of biologics for various indications. To reach such a stage, the focus must be on the characterization of the BBs as well as on the rules that guide their successful assembly into new NMEs. Consequently, a defining step to boost the overall success of an MsAb is the selection of its BBs. Interestingly, in the event that a specific class of BBs is not available or cannot be generated for a given target, the MsAb format becomes dependent on the available BBs. As discussed previously, understanding the advantages as well as the disadvantages that each BB brings to the final molecule is paramount for that molecule to meet the TPP. While the initial MsAb format should be based on an intended biological effect, the selection of BBs needs to take into account BB properties (e.g. Tm, solubility, expression levels) as well as BB compatibility (e.g. pI). In addition, the revolution in ML-based tools is transforming drug discovery as these in silico steps are integrated into the workflows. These hybrid approaches will lead to smarter workflows, with faster timelines and higher success rates.
ACKNOWLEDGEMENTS
The authors thank Hitisha Zaveri for assistance with the search and curation regarding the MsAbs in clinical trials. We thank Rajkumar Noubade, Oliver Nolan-Stevaux, Lei Chen and CM Hsieh for critically reading the review and providing insightful suggestions.
CONFLICT OF INTEREST STATEMENT
All authors are employees of Gilead Sciences, Inc.
ETHICS AND CONSENT STATEMENT
No patient consent is required.
ANIMAL RESEARCH STATEMENT
Not applicable.
DATA AVAILABILITY
The authors confirm that the data supporting the findings of this study are available within the article.
FUNDING
The research was funded by Gilead Sciences, Inc.
AUTHOR CONTRIBUTIONS
All authors have contributed and reviewed the final version of this review.
References
Hie, BL, Shanker, VR, Xu, D et al.
Lair, L, Qureshi, I, Bechtold, C.
Piha-Paul, SA, Gupta, M, Oh, D-Y et al.