We report the results of the COVID Moonshot, a fully open-science, crowdsourced, and structure-enabled drug discovery campaign targeting the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) main protease. We discovered a noncovalent, nonpeptidic inhibitor scaffold with lead-like properties that is differentiated from current main protease inhibitors. Our approach leveraged crowdsourcing, machine learning, exascale molecular simulations, and high-throughput structural biology and chemistry. We generated a detailed map of the structural plasticity of the SARS-CoV-2 main protease, extensive structure-activity relationships for multiple chemotypes, and a wealth of biochemical activity data. All compound designs (>18,000 designs), crystallographic data (>490 ligand-bound x-ray structures), assay data (>10,000 measurements), and synthesized molecules (>2400 compounds) for this campaign were shared rapidly and openly, creating a rich, open, and intellectual property-free knowledge base for future anticoronavirus drug discovery.
A common challenge in drug design pertains to finding chemical modifications to a ligand that increases its affinity to the target protein. An underutilized advance is the increase in structural biology throughput, which has progressed from an artisanal endeavor to a monthly throughput of hundreds of different ligands against a protein in modern synchrotrons. However, the missing piece is a framework that turns high-throughput crystallography data into predictive models for ligand design. Here, we designed a simple machine learning approach that predicts protein- ligand affinity from experimental structures of diverse ligands against a single protein paired with biochemical measurements. Our key insight is using physics-based energy descriptors to represent protein-ligand complexes and a learning-to-rank approach that infers the relevant differences between binding modes. We ran a high-throughput crystallography campaign against the SARS-CoV-2 main protease (MPro), obtaining parallel measurements of over 200 protein-ligand complexes and their binding activities. This allows us to design one-step library syntheses which improved the potency of two distinct micromolar hits by over 10-fold, arriving at a noncovalent and nonpeptidomimetic inhibitor with 120 nM antiviral efficacy. Crucially, our approach successfully extends ligands to unexplored regions of the binding pocket, executing large and fruitful moves in chemical space with simple chemistry.
Electrophiles for covalent inhibitors that are suitable for in vivo administration are rare. While acrylamides are prevalent in FDA-approved covalent drugs, chloroacetamides are considered too reactive for such purposes. We report sulfamate-based electrophiles that maintain chloroacetamide-like geometry with tunable reactivity. In the context of the BTK inhibitor ibrutinib, sulfamate analogues showed low reactivity with comparable potency in protein labeling, in vitro, and cellular kinase activity assays and were effective in a mouse model of CLL. In a second example, we converted a chloroacetamide Pin1 inhibitor to a potent and selective sulfamate acetamide with improved buffer stability. Finally, we show that sulfamate acetamides can be used for covalent ligand-directed release (CoLDR) chemistry, both for the generation of "turn-on" probes as well as for traceless ligand-directed site-specific labeling of proteins. Taken together, this chemistry represents a promising addition to the list of electrophiles suitable for in vivo covalent targeting.
The COVID-19 pandemic has revealed the vulnerability of the modern, global society. With expected waves of future infections by SARS-CoV-2, treatment options for infected individuals will be crucial in order to decrease mortality and hospitalizations. The SARS-CoV-2 main protease is a validated drug target, for which the first inhibitor has been approved for use in patients. To facilitate future work on this drug target, we designed a solid-phase synthesis route towards azapeptide activity-based probes that are capped with a cysteine-reactive electrophile for covalent modification of the active site of Mpro. This design led to the most potent ABP for Mpro and one of the most potent inhibitors reported thus far. We demonstrate that this ABP can be used to visualize Mpro activity and target engagement by drugs in infected cells.
PARP16-the sole ER-resident PARP family member-is gaining attention as a potential therapeutic target for cancer treatment. Nevertheless, the precise function of the catalytic activity of PARP16 is poorly understood. This is primarily due to the lack of inhibitors that are selective for PARP16 over other PARP family members. Herein, we describe a structure-guided strategy for generating a selective PARP16 inhibitor by incorporating two selectivity determinants into a phthalazinone pan-PARP inhibitor scaffold: (i) an acrylamide-based inhibitor (DB008) designed to covalently react with a non-conserved cysteine (Cys169, human numbering) in the NAD(+) binding pocket of PARP16 and (ii) a dual-purpose ethynyl group designed to bind in a unique hydrophobic cavity adjacent to the NAD(+) binding pocket as well as serve as a click handle. DB008 exhibits good selectivity for PARP16 versus other PARP family members. Copper-catalyzed azide-alkyne cycloaddition (CuAAC) confirmed that covalent labeling of PARP16 by DB008 in cells is dependent on Cys169. DB008 exhibits excellent proteome-wide selectivity at concentrations required to achieve saturable labeling of endogenous PARP16. In-cell competition labeling experiments using DB008 provided a facile strategy for evaluating putative PARP16 inhibitors. Lastly, we found that PARP16 is sequestered into a detergent-insoluble fraction under prolonged amino acid starvation, and surprisingly, treatment with PARP16 inhibitors prevented this effect. These results suggest that the catalytic activity of PARP16 regulates its solubility in response to nutrient stress.
Parathyroid hormone (PTH) regulates calcium metabolism and bone strength. Chronic kidney disease leads to secondary hyperparathyroidism (SHP) which increases morbidity and mortality. In experimental SHP, the increased PTH gene expression is due to enhanced PTH mRNA stability, mediated by changes in its interaction with stabilizing AUF1 and destabilizing KSRP. Pin1 isomerizes target proteins and leads to KSRP dephosphorylation. In SHP, Pin1 isomerase activity is decreased and phosphorylated KSRP fails to bind PTH mRNA, resulting in high PTH mRNA stability and levels. However, the up- and down-stream mechanisms by which kidney failure stimulates the parathyroid remain elusive. We now reveal a pathway where kidney failure induces parathyroid Pin1 phosphorylation, linking Pin1, KSRP and PTH mRNA stability as key players for the high PTH expression in SHP. We identified by mass-spectrometry, changes in rat parathyroid proteome and phosphoproteome profiles induced by impaired renal function, including KSRP phosphorylation at Pin1 target sites. Furthermore, both acute and chronic kidney failure led to parathyroid-specific Pin1 Ser16 and Ser71 phosphorylation, which disrupts Pin1 activity. Accordingly, pharmacologic Pin1 inhibition, that mimics the decreased Pin1 activity in SHP, increased PTH expression ex-vivo in parathyroid organ cultures and in transfected cells, through the PTH mRNA protein-interacting element and KSRP phosphorylation at potential Pin1-binding motifs. Therefore, kidney failure leads to loss of parathyroid Pin1 activity by inducing Pin1 phosphorylation. This predisposes parathyroids to increase PTH production through impaired PTH mRNA decay that is dependent on KSRP phosphorylation. Pin1 and KSRP phosphorylation and the Pin1-KSRP-PTH mRNA axis thus drive SHP.
High-throughput nanomole-scale synthesis allows for late-stage functionalization (LSF) of compounds in an efficient and economical manner. Here, we demonstrated that copper-catalyzed azide–alkyne cycloaddition could be used for the LSF of covalent kinase inhibitors at the nanoscale, enabling the synthesis of hundreds of compounds that did not require purification for biological assay screening, thus reducing experimental time drastically. We generated crude libraries of inhibitors for the kinase MKK7, derived from two different parental precursors, and analyzed them via the high-throughput In-Cell Western assay. Select inhibitors were resynthesized, validated via conventional biological and biochemical methods such as western blots and liquid chromatography–mass spectrometry (LC-MS) labeling, and successfully co-crystallized. Two of these compounds showed over 20-fold increased inhibitory activity compared to the parental compound. This study demonstrates that high-throughput LSF of covalent inhibitors at the nanomole-scale level can be an auspicious approach in improving the properties of lead chemical matter.
The pentafluorosulfanyl (-SF5) functional group is of increasing interest as a bioisostere in medicinal chemistry. A library of SF5-containing compounds, including amide, isoxazole, and oxindole derivatives, was synthesised using a range of solution-based and solventless methods, including microwave and ball-mill techniques. The library was tested against targets including human dihydroorotate dehydrogenase (HDHODH). A subsequent focused approach led to synthesis of analogues of the clinically used disease modifying anti-rheumatic drugs (DMARDs), Teriflunomide and Leflunomide, considered for potential COVID-19 use, where SF5 bioisostere deployment led to improved inhibition of HDHODH compared with the parent drugs. The results demonstrate the utility of the SF5 group in medicinal chemistry.
The Protein Data Bank in Europe - Knowledge Base (PDBe-KB, https://pdbe-kb.org) is an open collaboration between world-leading specialist data resources contributing functional and biophysical annotations derived from or relevant to the Protein Data Bank (PDB). The goal of PDBe-KB is to place macromolecular structure data in their biological context by developing standardised data exchange formats and integrating functional annotations from the contributing partner resources into a knowledge graph that can provide valuable biological insights. Since we described PDBe-KB in 2019, there have been significant improvements in the variety of available annotation data sets and user functionality. Here, we provide an overview of the consortium, highlighting the addition of annotations such as predicted covalent binders, phosphorylation sites, effects of mutations on the protein structure and energetic local frustration. In addition, we describe a library of reusable web-based visualisation components and introduce new features such as a bulk download data service and a novel superposition service that generates clusters of superposed protein chains weekly for the whole PDB archive.
Designing covalent inhibitors is increasingly important, although it remains challenging. Here, we present covalentizer, a computational pipeline for identifying irreversible inhibitors based on structures of targets with non-covalent binders. Through covalent docking of tailored focused libraries, we identify candidates that can bind covalently to a nearby cysteine while preserving the interactions of the original molecule. We found ∼11,000 cysteines proximal to a ligand across 8,386 complexes in the PDB. Of these, the protocol identified 1,553 structures with covalent predictions. In a prospective evaluation, five out of nine predicted covalent kinase inhibitors showed half-maximal inhibitory concentration (IC50) values between 155 nM and 4.5 μM. Application against an existing SARS-CoV Mpro reversible inhibitor led to an acrylamide inhibitor series with low micromolar IC50 values against SARS-CoV-2 Mpro. The docking was validated by 12 co-crystal structures. Together these examples hint at the vast number of covalent inhibitors accessible through our protocol.
Chemical modifications of native proteins can affect their stability, activity, interactions, localization, and more. However, there are few nongenetic methods for the installation of chemical modifications at a specific protein site in cells. Here we report a covalent ligand directed release (CoLDR) site-specific labeling strategy, which enables the installation of a variety of functional tags on a target protein while releasing the directing ligand. Using this approach, we were able to label various proteins such as BTK, K-RasG12C, and SARS-CoV-2 PLpro with different tags. For BTK we have shown selective labeling in cells of both alkyne and fluorophores tags. Protein labeling by traditional affinity methods often inhibits protein activity since the directing ligand permanently occupies the target binding pocket. We have shown that using CoLDR chemistry, modification of BTK by these probes in cells preserves its activity. We demonstrated several applications for this approach including determining the half-life of BTK in its native environment with minimal perturbation, as well as quantification of BTK degradation by a noncovalent proteolysis targeting chimera (PROTAC) by in-gel fluorescence. Using an environment-sensitive “turn-on” fluorescent probe, we were able to monitor ligand binding to the active site of BTK. Finally, we have demonstrated efficient CoLDR-based BTK PROTACs (DC50
Pancreatic ductal adenocarcinoma (PDAC) is characterized by notorious resistance to current therapies attributed to inherent tumor heterogeneity and highly desmoplastic and immunosuppressive tumor microenvironment (TME). Unique proline isomerase Pin1 regulates multiple cancer pathways, but its role in the TME and cancer immunotherapy is unknown. Here, we find that Pin1 is overexpressed both in cancer cells and cancer-associated fibroblasts (CAFs) and correlates with poor survival in PDAC patients. Targeting Pin1 using clinically available drugs induces complete elimination or sustained remissions of aggressive PDAC by synergizing with anti-PD-1 and gemcitabine in diverse model systems. Mechanistically, Pin1 drives the desmoplastic and immunosuppressive TME by acting on CAFs and induces lysosomal degradation of the PD-1 ligand PD-L1 and the gemcitabine transporter ENT1 in cancer cells, besides activating multiple cancer pathways. Thus, Pin1 inhibition simultaneously blocks multiple cancer pathways, disrupts the desmoplastic and immunosuppressive TME, and upregulates PD-L1 and ENT1, rendering PDAC eradicable by immunochemotherapy.
The peptidyl-prolyl isomerase, Pin1, is exploited in cancer to activate oncogenes and inactivate tumor suppressors. However, despite considerable efforts, Pin1 has remained an elusive drug target. Here, we screened an electrophilic fragment library to identify covalent inhibitors targeting Pin1's active site Cys113, leading to the development of Sulfopin, a nanomolar Pin1 inhibitor. Sulfopin is highly selective, as validated by two independent chemoproteomics methods, achieves potent cellular and in vivo target engagement and phenocopies Pin1 genetic knockout. Pin1 inhibition had only a modest effect on cancer cell line viability. Nevertheless, Sulfopin induced downregulation of c-Myc target genes, reduced tumor progression and conferred survival benefit in murine and zebrafish models of MYCN-driven neuroblastoma, and in a murine model of pancreatic cancer. Our results demonstrate that Sulfopin is a chemical probe suitable for assessment of Pin1-dependent pharmacology in cells and in vivo, and that Pin1 warrants further investigation as a potential cancer drug target.
Electrophilic peptides that form an irreversible covalent bond with their target have great potential for binding targets that have been previously considered undruggable. However, the discovery of such peptides remains a challenge. Here, we present Rosetta CovPepDock, a computational pipeline for peptide docking that incorporates covalent binding between the peptide and a receptor cysteine. We applied CovPepDock retrospectively to a dataset of 115 disulfide-bound peptides and a dataset of 54 electrophilic peptides. It produced a top-five scoring, near-native model, in 89% and 100% of the cases when docking from the native conformation, and 20% and 90% when docking from an extended peptide conformation, respectively. In addition, we developed a protocol for designing electrophilic peptide binders based on known non-covalent binders or protein–protein interfaces. We identified 7154 peptide candidates in the PDB for application of this protocol. As a proof-of-concept we validated the protocol on the non-covalent complex of 14-3-3σ and YAP1 phosphopeptide. The protocol identified seven highly potent and selective irreversible peptide binders. The predicted binding mode of one of the peptides was validated using X-ray crystallography. This case-study demonstrates the utility and impact of CovPepDock. It suggests that many new electrophilic peptide binders can be rapidly discovered, with significant potential as therapeutic molecules and chemical probes.
The spillover of animal coronaviruses (aCoVs) to humans has caused SARS, MERS, and COVID-19. Although antibody responses displaying cross-reactivity between SARS-CoV-2 and seasonal/common cold human coronaviruses (hCoVs) have been reported, potential cross-reactivity with aCoVs and the diagnostic implications are incompletely understood. Here, we probed for antibody binding against all 7 hCoVs and 49 aCoVs represented as 12,924 peptides within a phage-displayed antigen library. Antibody repertoires of 269 recovered patients with COVID-19 showed distinct changes compared with 260 unexposed prepandemic controls, not limited to binding of SARS-CoV-2 antigens but including binding to antigens from hCoVs and aCoVs with shared motifs to SARS-CoV-2. We isolated broadly reactive monoclonal antibodies from recovered patients with COVID-19 who bind a shared motif of SARS-CoV-2, hCoV-OC43, hCoV-HKU1, and several aCoVs, demonstrating that interspecies cross-reactivity can be mediated by a single immunoglobulin. Using antibody binding data against the entire CoV antigen library allowed accurate discrimination of recovered patients with COVID-19 from unexposed individuals by machine learning. Leaving out SARS-CoV-2 antigens and relying solely on antibody binding to other hCoVs and aCoVs achieved equally accurate detection of SARS-CoV-2 infection. The ability to detect SARS-CoV-2 infection without knowledge of its unique antigens solely from cross-reactive antibody responses against other hCoVs and aCoVs suggests a potential diagnostic strategy for the early stage of future pandemics. Creating regularly updated antigen libraries representing the animal coronavirome can provide the basis for a serological assay already poised to identify infected individuals after a future zoonotic transmission event.
Targeted protein degradation offers several advantages over direct inhibition of protein activity and is gaining increasing interest in chemical biology and drug discovery. Proteolysis targeting chimeras (PROTACs) in particular are enjoying widespread application. However, PROTACs, which recruit an E3 ligase for degradation of a target protein, still suffer from certain challenges. These include a limited selection for E3 ligases on the one hand and the requirement for potent target binding on the other hand. Both issues restrict the target scope available for PROTACs. Degraders that covalently engage the target protein or the E3 ligase can potentially expand the pool of both targets and E3 ligases. Moreover, they may offer additional advantages by improving the kinetics of ternary complex formation or by endowing additional selectivity to the degrader. Here, we review the recent progress in the emerging field of covalent PROTACs.
Electrophilic natural products (ENPs) are a rich source of bioactive molecules with tremendous therapeutic potential. While their synthetic complexity may hinder their direct use as therapeutics, they represent tools for elucidation of suitable molecular targets and serve as inspiration for the design of simplified synthetic counterparts. Here, we review the recent use of various activity-based protein profiling methods to uncover molecular targets of ENPs. Beyond target identification, these examples also showcase further development of synthetic ligands from natural product starting points. Two examples demonstrate how ENPs can progress the emerging fields of targeted protein degradation and molecular glues. Though challenges still remain in the synthesis of ENP-based probes, and in their synthetic simplification, their potential for discovery of novel mechanisms of action makes it well worth the effort.
The SARS-CoV-2 main viral protease (Mpro) is an attractive target for antivirals given its distinctiveness from host proteases, essentiality in the viral life cycle and conservation across coronaviridae. We launched the COVID Moonshot initiative to rapidly develop patent-free antivirals with open science and open data. Here we report the use of machine learning for de novo design, coupled with synthesis route prediction, in our campaign. We discover novel chemical scaffolds active in biochemical and live virus assays, synthesized with model generated routes.
Understanding protein-ligand interactions in a cellular context is an important goal in molecular biology and biochemistry, and particularly for drug development. Investigators must demonstrate that drugs penetrate cells and specifically bind their targets. Towards that end, we present a native mass spectrometry (MS)-based method for analyzing drug uptake and target engagement in eukaryotic cells. This method is based on our previously introduced direct-MS method for rapid analysis of proteins directly from crude samples. Here, direct-MS enables label-free studies of protein-drug binding in human cells and is used to determine binding affinities of lead compounds in crude samples. We anticipate that this method will enable the application of native MS to a range of problems where cellular context is important, including protein-protein interactions, drug uptake and binding, and characterization of therapeutic proteins.
In early 2020, a spontaneous global collaboration came together to design a new, urgent antiviral treatment. There are lessons in what happened next.
Proteolysis targeting chimeras (PROTACs) are small molecules that form ternary complexes between their target and E3 ligase, resulting in ubiquitination and proteasomal degradation of the target protein. Using our own designed Bruton’s tyrosine kinase (BTK) PROTAC compounds, we show herein efficient BTK degradation in chronic lymphocytic leukemia (CLL) cells. The reversible non-covalent compound (NC-1) was the most potent and therefore we focused on this PROTAC to investigate its subsequent effects on the BCR pathway. NC-1 decreased baseline BTK phosphorylation as well as activation of BTK and other signaling molecules downstream of the BCR pathway, following IgM engagement. These effects were also obtained in samples from CLL patients with clinical resistance to ibrutinib and mutations at C481. NC-1 treatment further decreased baseline CD69 surface levels, completely abrogated its upregulation following IgM activation, decreased CLL cells migration toward SDF-1 and overcame stromal anti-apoptotic protection. In conclusion, our results indicate that targeting BTK using the PROTAC strategy could be a potential novel therapeutic approach for CLL.
Targeted covalent inhibitors are an important class of drugs and chemical probes. However, relatively few electrophiles meet the criteria for successful covalent inhibitor design. Here we describe α-substituted methacrylamides as a new class of electrophiles suitable for targeted covalent inhibitors. While typically α-substitutions inactivate acrylamides, we show that hetero α-substituted methacrylamides have higher thiol reactivity and undergo a conjugated addition–elimination reaction ultimately releasing the substituent. Their reactivity toward thiols is tunable and correlates with the pKa/pKb of the leaving group. In the context of the BTK inhibitor ibrutinib, these electrophiles showed lower intrinsic thiol reactivity than the unsubstituted ibrutinib acrylamide. This translated to comparable potency in protein labeling, in vitro kinase assays, and functional cellular assays, with improved selectivity. The conjugate addition–elimination reaction upon covalent binding to their target cysteine allows functionalizing α-substituted methacrylamides as turn-on probes. To demonstrate this, we prepared covalent ligand directed release (CoLDR) turn-on fluorescent probes for BTK, EGFR, and K-RasG12C. We further demonstrate a BTK CoLDR chemiluminescent probe that enabled a high-throughput screen for BTK inhibitors. Altogether we show that α-substituted methacrylamides represent a new and versatile addition to the toolbox of targeted covalent inhibitor design.
Covalent inhibitors have several attractive features as chemical probes and drugs. Historically, the common way to develop covalent inhibitors was to derivatize a known non-covalent inhibitor or substrate. In recent years, however, covalent fragment screening is establishing itself as an alternative, perhaps more suitable for challenging targets, with no previous chemical matter. Fragments that can irreversibly bind their target can overcome the low affinity that limits reversible fragment screening, and offer a very direct and general binding assay via mass spectrometry. All the while, they retain a high probability of binding to target proteins and efficient coverage of chemical space due to their small size. As this method is increasingly utilized, insights were gained regarding the most suitable electrophiles for covalent fragment libraries, methods to deal with intrinsic reactivity and notable successful applications were reported. Here we review recent efforts in this field in an attempt to enable broader and more efficient use of covalent fragment screening.
The ubiquitous heat shock protein 70 (HSP70) family consists of ATP-dependent molecular chaperones, which perform numerous cellular functions that affect almost all aspects of the protein life cycle from synthesis to degradation1–3. Achieving this broad spectrum of functions requires precise regulation of HSP70 activity. Proteins of the HSP40 family, also known as J-domain proteins (JDPs), have a key role in this process by preselecting substrates for transfer to their HSP70 partners and by stimulating the ATP hydrolysis of HSP70, leading to stable substrate binding3,4. In humans, JDPs constitute a large and diverse family with more than 40 different members2, which vary in their substrate selectivity and in the nature and number of their client-binding domains5. Here we show that JDPs can also differ fundamentally in their interactions with HSP70 chaperones. Using nuclear magnetic resonance spectroscopy6,7 we find that the major class B JDPs are regulated by an autoinhibitory mechanism that is not present in other classes. Although in all JDPs the interaction of the characteristic J-domain is responsible for the activation of HSP70, in DNAJB1 the HSP70-binding sites in this domain are intrinsically blocked by an adjacent glycine-phenylalanine rich region—an inhibition that can be released upon the interaction of a second site on DNAJB1 with the HSP70 C-terminal tail. This regulation, which controls substrate targeting to HSP70, is essential for the disaggregation of amyloid fibres by HSP70–DNAJB1, illustrating why no other class of JDPs can substitute for class B in this function. Moreover, this regulatory layer, which governs the functional specificities of JDP co-chaperones and their interactions with HSP70s, could be key to the wide range of cellular functions of HSP70.
Mitogen-activated protein kinase kinase 7 (MAP2K7) in the c-Jun N-terminal kinase signal cascade is an attractive drug target for a variety of diseases. The selectivity of MAP2K7 inhibitors against off-target kinases is a major barrier in drug development. We report a crystal structure of MAP2K7 complexed with a potent covalent inhibitor bearing an acrylamide moiety as an electrophile, which discloses a structural basis for producing selective and potent MAP2K7 inhibitors.
COVID-19, caused by SARS-CoV-2, lacks effective therapeutics. Additionally, no antiviral drugs or vaccines were developed against the closely related coronavirus, SARS-CoV-1 or MERS-CoV, despite previous zoonotic outbreaks. To identify starting points for such therapeutics, we performed a large-scale screen of electrophile and non-covalent fragments through a combined mass spectrometry and X-ray approach against the SARS-CoV-2 main protease, one of two cysteine viral proteases essential for viral replication. Our crystallographic screen identified 71 hits that span the entire active site, as well as 3 hits at the dimer interface. These structures reveal routes to rapidly develop more potent inhibitors through merging of covalent and non-covalent fragment hits; one series of low-reactivity, tractable covalent fragments were progressed to discover improved binders. These combined hits offer unprecedented structural and reactivity information for on-going structure-based drug design against SARS-CoV-2 main protease.
Proteolysis-targeting chimeras (PROTACs), which induce degradation by recruitment of an E3 ligase to a target protein, are gaining much interest as a new pharmacological modality. However, designing PROTACs is challenging. Formation of a ternary complex between the protein target, the PROTAC and the recruited E3 ligase is considered paramount for successful degradation. A structural model of this ternary complex could in principle inform rational PROTAC design. Unfortunately, only a handful of structures are available for such complexes, necessitating tools for their modeling. We developed a combined protocol for the modeling of a ternary complex induced by a given PROTAC. Our protocol alternates between sampling of the protein-protein interaction space and the PROTAC molecule conformational space. Application of this protocol - PRosettaC - to a benchmark of known PROTAC ternary complexes results in near-native predictions, with often atomic accuracy prediction of the protein chains, as well as the PROTAC binding moieties. It allowed the modeling of a CRBN/BTK complex that recapitulated experimental results for a series of PROTACs. PRosettaC generated models may be used to design PROTACs for new targets, as well as improve PROTACs for existing targets, potentially cutting down time and synthesis efforts. To enable wide access to this protocol we have made it available through a web-server (http://prosettac.weizmann.ac.il/).
PROteolysis Targeting Chimeras (PROTACs) represent an exciting inhibitory modality with many advantages, including sub-stoichiometric degradation of targets. Their scope, though, is still limited to-date by the requirement for a sufficiently potent target binder. A solution that proved useful in tackling challenging targets is the use of electrophiles to allow irreversible binding to the target. However, such binding will negate the catalytic nature of PROTACs. Reversible covalent PROTACs potentially offer the best of both worlds. They possess the potency and selectivity associated with the formation of the covalent bond, while being able to dissociate and regenerate once the protein target is degraded. Using Bruton’s tyrosine kinase (BTK) as a clinically relevant model system, we show efficient covalent degradation by non-covalent, irreversible covalent and reversible covalent PROTACs, with 85% degradation. Our data suggests that part of the degradation by our irreversible covalent PROTACs is driven by reversible binding prior to covalent bond formation, while the reversible covalent PROTACs drive degradation primarily by covalent engagement. The PROTACs showed enhanced inhibition of B cell activation compared to Ibrutinib, and exhibit potent degradation of BTK in patients-derived primary chronic lymphocytic leukemia cells. The most potent reversible covalent PROTAC, RC-3, exhibited enhanced selectivity towards BTK compared to non-covalent and irreversible covalent PROTACs. These compounds may pave the way for the design of covalent PROTACs for a wide variety of challenging targets.
KRAS, one of the most prevalent oncogenes and sought-after anticancer targets, has eluded chemists for decades until an irreversible covalent strategy targeting a specific mutation (G12C) paved the way for the first KRAS inhibitors to reach the clinic. MRTX849 is one such clinical candidate with promising initial results in patients harboring the mutation. The impressive optimization story of MRTX849 highlights challenges and solutions in the development of covalent drugs, including the use of an α-fluoroacrylamide electrophile.
The Rosetta software for macromolecular modeling, docking and design is extensively used in laboratories worldwide. During two decades of development by a community of laboratories at more than 60 institutions, Rosetta has been continuously refactored and extended. Its advantages are its performance and interoperability between broad modeling capabilities. Here we review tools developed in the last 5 years, including over 80 methods. We discuss improvements to the score function, user interfaces and usability. Rosetta is available at http://www.rosettacommons.org.
Peptidyl-prolyl cis/trans isomerase NIMA-interacting 1 (Pin1) is commonly overexpressed in human cancers, including pancreatic ductal adenocarcinoma (PDAC). While Pin1 is dispensable for viability in mice, it is required for activated Ras to induce tumorigenesis, suggesting a role for Pin1 inhibitors in Ras-driven tumors, such as PDAC. We report the development of rationally designed peptide inhibitors that covalently target Cys113, a highly conserved cysteine located in the Pin1 active site. The inhibitors were iteratively optimized for potency, selectivity and cell permeability to give BJP-06-005-3, a versatile tool compound with which to probe Pin1 biology and interrogate its role in cancer. In parallel to inhibitor development, we employed genetic and chemical-genetic strategies to assess the consequences of Pin1 loss in human PDAC cell lines. We demonstrate that Pin1 cooperates with mutant KRAS to promote transformation in PDAC, and that Pin1 inhibition impairs cell viability over time in PDAC cell lines.
Bruton's tyrosine kinase (BTK) is a major drug target for B-cell related malignancies; however, existing BTK inhibitors approved for cancer treatment have significant off-targets that limit their use for autoimmune and inflammatory diseases. Remibrutinib (LOU064) is a novel covalent BTK inhibitor that binds an inactive BTK conformation, which affords it unprecedented selectivity. Its optimization led to rapid BTK engagement in vivo and fast clearance, further limiting systemic exposure. Remibrutinib is currently in phase 2 clinical trials for treatment of chronic urticaria and Sjoegren's syndrome.
Outbreaks of human epidemic nonbacterial gastroenteritis are mainly caused by noroviruses. Viral replication requires a 3C-like cysteine protease (3CLpro) which processes the 200 kDa viral polyprotein into six functional proteins. The 3CLpro has attracted much interest due to its potential as a target for antiviral drugs. A system for growing high-quality crystals of native Southampton norovirus 3CLpro (SV3CP) has been established, allowing the ligand-free crystal structure to be determined to 1.3 Å in a tetrameric state. This also allowed crystal-based fragment screening to be performed with various compound libraries, ultimately to guide drug discovery for SV3CP. A total of 19 fragments were found to bind to the protease out of the 844 which were screened. Two of the hits were located at the active site of SV3CP and showed good inhibitory activity in kinetic assays. Another 5 were found at the enzyme’s putative RNA-binding site and a further 11 were located in the symmetric central cavity of the tetramer.
FAT10 is a ubiquitin-like protein suggested to target proteins for proteasomal degradation. It is highly upregulated upon pro-inflammatory cytokines, namely, TNF alpha, IFN gamma, and IL6, and was found to be highly expressed in various epithelial cancers. Evidence suggests that FAT10 is involved in cancer development and may have a pro-tumorigenic role. However, its biological role is still unclear, as well as its biochemical and cellular regulation. To identify pathways underlying FAT10 expression in the context of pro-inflammatory stimulation, which characterizes the cancerous environment, we implemented a phenotypic transcriptional reporter screen with a library of annotated compounds. We identified AZ960, a potent JAK2 inhibitor, which significantly downregulates FAT10 under pro-inflammatory cytokines induction, in an NF kappa B-independent manner. We validated JAK2 as a major regulator of FAT10 expression via knockdown, and we suggest that the transcriptional effects are mediated through pSTAT1/3/5. Overall, we have elucidated a pathway regulating FAT10 transcription and discovered a tool compound to chemically downregulate FAT10 expression, and to further study its biology.
Insecticides allow control of agricultural pests and disease vectors and are vital for global food security and health. The evolution of resistance to insecticides, such as organophosphates (OPs), is a serious and growing concern. OP resistance often involves sequestration or hydrolysis of OPs by carboxylesterases. Inhibiting carboxylesterases could, therefore, restore the effectiveness of OPs for which resistance has evolved. Here, we use covalent virtual screening to produce nano-/picomolar boronic acid inhibitors of the carboxylesterase alpha E7 from the agricultural pest Lucilia cuprina as well as a common Gly137Asp alpha E7 mutant that confers OP resistance. These inhibitors, with high selectivity against human acetylcholinesterase and low to no toxicity in human cells and in mice, act synergistically with the OPs diazinon and malathion to reduce the amount of OP required to kill L. cuprina by up to 16-fold and abolish resistance. The compounds exhibit broad utility in significantly potentiating another OP, chlorpyrifos, against the common pest, the peach-potato aphid (Myzus persicae). These compounds represent a solution to OP resistance as well as to environmental concerns regarding overuse of OPs, allowing significant reduction of use without compromising efficacy.
Covalent probes can display unmatched potency, selectivity, and duration of action; however, their discovery is challenging. In principle, fragments that can irreversibly bind their target can overcome the low affinity that limits reversible fragment screening, but such electrophilic fragments were considered nonselective and were rarely screened. We hypothesized that mild electrophiles might overcome the selectivity challenge and constructed a library of 993 mildly electrophilic fragments. We characterized this library by a new high-throughput thiol-reactivity assay and screened them against 10 cysteine-containing proteins. Highly reactive and promiscuous fragments were rare and could be easily eliminated. In contrast, we found hits for most targets. Combining our approach with high-throughput crystallography allowed rapid progression to potent and selective probes for two enzymes, the deubiquitinase OTUB2 and the pyrophosphatase NUDT7. No inhibitors were previously known for either. This study highlights the potential of electrophile-fragment screening as a practical and efficient tool for covalent-ligand discovery.
The c-Jun NH2-terminal kinase (JNK) signaling pathway is central to the cell response to stress, inflammatory signals, and toxins. While selective inhibitors are known for JNKs and for various upstream MAP3Ks, no selective inhibitor is reported for MKK7––one of two direct MAP2Ks that activate JNK. Here, using covalent virtual screening, we identify selective MKK7 covalent inhibitors. We optimized these compounds to low-micromolar inhibitors of JNK phosphorylation in cells. The crystal structure of a lead compound bound to MKK7 demonstrated that the binding mode was correctly predicted by docking. We asserted the selectivity of our inhibitors on a proteomic level and against a panel of 76 kinases, and validated an on-target effect using knockout cell lines. Lastly, we show that the inhibitors block activation of primary mouse B cells by lipopolysaccharide. These MKK7 tool compounds will enable better investigation of JNK signaling and may serve as starting points for therapeutics.
The success of targeted covalent inhibitors in the global pharmaceutical industry has led to a resurgence of covalent drug discovery. However, covalent inhibitor design for flexible binding sites remains a difficult task due to a lack of methodological development. Here, we compared covalent docking to empirical electrophile screening against the highly dynamic target K-Ras(G12C). While the overall hit rate of both methods was comparable, we were able to rapidly progress a docking hit to a potent irreversible covalent binder that modifies the inactive, GDP-bound state of K-Ras(G12C). Hydrogen-deuterium exchange mass spectrometry was used to probe the protein dynamics of compound binding to the switch-II pocket and subsequent destabilization of the nucleotide-binding region. SOS-mediated nucleotide exchange assays showed that, contrary to prior switch-II pocket inhibitors, these new compounds appear to accelerate nucleotide exchange. This study highlights the efficiency of covalent docking as a tool for the discovery of chemically novel hits against challenging targets.
Chemokines orchestrate cellmigration for development, immune surveillance, and disease by binding to cell surface heterotrimeric guanine nucleotide-binding protein (G protein)-coupled receptors (GPCRs). The array of interactions between the nearly 50 chemokines and their 20 GPCR targets generates an extensive signaling network to which promiscuity and biased agonism add further complexity. The receptor CXCR4 recognizes both monomeric and dimeric forms of the chemokine CXCL12, which is a distinct example of ligand bias in the chemokine family. We demonstrated that a constitutively monomeric CXCL12 variant reproduced the G protein-dependent and b-arrestin-dependent responses that are associated with normal CXCR4 signaling and lead to cell migration. In addition, monomeric CXCL12 made specific contacts with CXCR4 that are not present in the structure of the receptor in complex with a dimeric form of CXCL12, a biased agonist that stimulates only G protein-dependent signaling. We produced an experimentally validated model of an agonist-bound chemokine receptor that merged a nuclear magnetic resonance-based structure of monomeric CXCL12 bound to the amino terminus of CXCR4 with a crystal structure of the transmembrane domains of CXCR4. The large CXCL12: CXCR4 protein-protein interface revealed by this structure identified previously uncharacterized functional interactions that fall outside of the classical "two-site model" for chemokine-receptor recognition. Our model suggests a mechanistic hypothesis for how interactions on the extracellular face of the receptor may stimulate the conformational changes required for chemokine receptor-mediated signal transduction.
This volume covers an array of techniques available for studying peptide-protein docking and design. The book is divided into four sections: peptide binding site prediction; peptide-protein docking; prediction and design of peptide binding specificity; and the design of inhibitory peptides. The chapters in Modeling Peptide-Protein Interactions: Methods and Protocols cover topics such as the usage of ACCLUSTER and PeptiMap for peptide binding site prediction; AnchorDock and ATTRACT for blind, flexible docking of peptides to proteins; flexible peptide docking using HADDOCK and FlexPepDock; identifying loop-mediated protein-protein interactions using LoopFinder; and protein-peptide interaction design using PinaColada. Written in the highly successful Methods in Molecular Biology series format, chapters include introductions to their respective topics, lists of the necessary details for successful application of the different approaches and step-by-step, readily reproducible protocols, as well as tips on troubleshooting and avoiding known pitfalls.Cutting-edge and thorough, Modeling Peptide-Protein Interactions: Methods and Protocols provides a diverse and unified overview of this rapidly advancing field of major interest and applicability.
Galactofuranose (Galf) is present in glycans critical for the virulence and viability of several pathogenic microbes, including Mycobacterium tuberculosis, yet the monosaccharide is absent from mammalian glycans. Uridine 5'-diphosphate-galactopyranose mutase (UGM) catalyzes the formation of UDP-Galf, which is required to produce Galf-containing glycoconjugates. Inhibitors of UGM have therefore been sought, both as antimicrobial leads and as tools to delineate the roles of Galf in cells. Obtaining cell permeable UGM probes by either design or high throughput screens has been difficult, as has elucidating how UGM binds small molecule, noncarbohydrate inhibitors. To address these issues, we employed structure-based virtual screening to uncover new inhibitor chemotypes, including a triazolothiadiazine series. These compounds are among the most potent antimycobacterial UGM inhibitors described. They also facilitated determination of a UGM small molecule inhibitor structure, which can guide optimization. A comparison of results from the computational screen and a high-throughput fluorescence polarization (FP) screen indicated that the scaffold hits from the former had been evaluated in the FP screen but missed. By focusing on promising compounds, the virtual screen rescued false negatives, providing a blueprint for generating new UGM probes and therapeutic leads.
Enzyme function prediction remains an important open problem. Though structure-based modeling, such as metabolite docking, can identify substrates of some enzymes, it is ill-suited to reactions that progress through a covalent intermediate. Here we investigated the ability of covalent docking to identify substrates that pass through such a covalent intermediate, focusing particularly on the haloalkanoate dehalogenase superfamily. In retrospective assessments, covalent docking recapitulated substrate binding modes of known cocrystal structures and identified experimental substrates from a set of putative phosphorylated metabolites. In comparison, noncovalent docking of high-energy intermediates yielded nonproductive poses. In prospective predictions against seven enzymes, a substrate was identified for five. For one of those cases, a covalent docking prediction, confirmed by empirical screening, and combined with genomic context analysis, suggested the identity of the enzyme that catalyzes the orphan phosphatase reaction in the riboflavin biosynthetic pathway of Bacteroides.
Chemical probes that form a covalent bond with a protein target often show enhanced selectivity, potency and utility for biological studies. Despite these advantages, protein-reactive compounds are usually avoided in high-throughput screening campaigns. Here we describe a general method (DOCKovalent) for screening large virtual libraries of electrophilic small molecules. We apply this method prospectively to discover reversible covalent fragments that target distinct protein nucleophiles, including the catalytic serine of AmpC beta- lactamase and noncatalytic cysteines in RSK2, MSK1 and JAK3 kinases. We identify submicromolar to low-nanomolar hits with high ligand efficiency, cellular activity and selectivity, including what are to our knowledge the first reported reversible covalent inhibitors of JAK3. Crystal structures of inhibitor complexes with AmpC and RSK2 confirm the docking predictions and guide further optimization. As covalent virtual screening may have broad utility for the rapid discovery of chemical probes, we have made the method freely available through an automated web server (covalent.docking.org/).
Toll-like receptor 4 (TLR4) plays an important role in the regulation of the innate and adaptive immune response. Both agonists and antagonists of TLR4 are of considerable interest as drug leads for various disease indications. We herein report the rational design of two myeloid differentiation factor 2 (MD2)-derived macrocyclic peptides as TLR4 modulators, using the Rosetta Macromolecular Modeling software. The designed cyclic peptides, but not their linear counterparts, displayed synergistic activation of TLR signaling when co-administered with lipopolysaccharide (LPS). Although the understanding of the mechanism of action of these peptides remains elusive, these results underscore the utility of peptide cyclization for the discovery of biologically active agents, and also provide valuable tools for the investigation of TLR4 signaling.
Enzymes in the glutathione transferase (GST) superfamily catalyze the conjugation of glutathione (GSH) to electrophilic substrates. As a consequence they are involved in a number of key biological processes, including protection of cells against chemical damage, steroid and prostaglandin biosynthesis, tyrosine catabolism, and cell apoptosis. Although virtual screening has been used widely to discover substrates by docking potential noncovalent ligands into active site clefts of enzymes, docking has been rarely constrained by a covalent bond between the enzyme and ligand. In this study, we investigate the accuracy of docking poses and substrate discovery in the GST superfamily, by docking 6738 potential ligands from the KEGG and MetaCyc compound libraries into 14 representative GST enzymes with known structures and substrates using the PLOP program [Jacobson et al. Proteins 2004, SS, 351]. For X-ray structures as receptors, one of the top 3 ranked models is within 3 angstrom all-atom root mean square deviation (RMSD) of the native complex in 11 of the 14 cases; the enrichment LogAUC value is better than random in all cases, and better than 25 in 7 of 11 cases. For comparative models as receptors, near-native ligand-enzyme configurations are often sampled but difficult to rank highly. For models based on templates with the highest sequence identity, the enrichment LogAUC is better than 25 in 5 of 11 cases, not significantly different from the crystal structures. In conclusion, we show that covalent docking can be a useful tool for substrate discovery and point out specific challenges for future method improvement.
Computational protein design efforts aim to create novel proteins and functions in an automated manner and, in the process, these efforts shed light on the factors shaping natural proteins. The focus of these efforts has progressed from the interior of proteins to their surface and the design of functions, such as binding or catalysis. Here we examine progress in the development of robust methods for the computational design of non-natural interactions between proteins and molecular targets such as other proteins or small molecules. This problem is referred to as the de nova computational design of interactions. Recent successful efforts in de nova enzyme design and the de novo design of protein-protein interactions open a path towards solving this problem. We examine the common themes in these efforts, and review recent studies aimed at understanding the nature of successes and failures in the de nova computational design of interactions. While several approaches culminated in success, the use of a well-defined structural model for a specific binding interaction in particular has emerged as a key strategy for a successful design, and is therefore reviewed with special consideration.
Protein–Protein Interactions (PPIs) mediate numerous biological functions. As such, the inhibition of specific PPIs has tremendous therapeutic value. The notion that these interactions are ‘undruggable’ has petered out with the emergence of more and more successful examples of PPI inhibitors, expanding considerably the scope of potential drug targets. The accumulated data on successes in the inhibition of PPIs allow us to analyze the features that are required for such inhibition. Whereas it has been suggested and shown that targeting hot spots at PPI interfaces is a good strategy to achieve inhibition, in this review we focus on the notion that the most amenable interactions for inhibition are those that are mediated by a ‘hot segment’, a continuous epitope that contributes the majority of the binding energy. This criterion is both useful in guiding future target selection efforts, and in suggesting immediate inhibitory candidates – the dominant peptidic segment that mediates the targeted interaction.
Peptide-mediated interactions are gaining increased attention due to their predominant roles in the many regulatory processes that involve dynamic interactions between proteins. The structures of such interactions provide an excellent starting point for their characterization and manipulation, and can provide leads for targeted inhibitor design. The relatively few experimentally determined structures of peptide–protein complexes can be complemented with an outburst of modeling approaches that have been introduced in recent years, with increasing accuracy and applicability to ever more systems. We review different methods to address the considerable challenges in modeling the binding of a short yet highly flexible peptide to its partner. These methods apply an array of sampling strategies and draw from a recent amassing of knowledge about the biophysical nature of peptide–protein interactions. We elaborate on applications of these structure-based approaches and in particular on the characterization of peptide binding specificity to different peptide-binding domains and enzymes. Such applications can identify new biological targets and thus complement our current view of protein–protein interactions in living organisms. Accurate peptide–protein docking is of particular importance in the light of increased appreciation of the crucial functional roles of disordered regions and the many linear binding motifs embedded within.
Finding bacterial cellular targets for developing novel antibiotics has become a major challenge in fighting resistant pathogenic bacteria. We present a novel compound, Relacin, designed to inhibit (p) ppGpp production by the ubiquitous bacterial enzyme RelA that triggers the Stringent Response. Relacin inhibits RelA in vitro and reduces (p) ppGpp production in vivo. Moreover, Relacin affects entry into stationary phase in Gram positive bacteria, leading to a dramatic reduction in cell viability. When Relacin is added to sporulating Bacillus subtilis cells, it strongly perturbs spore formation regardless of the time of addition. Spore formation is also impeded in the pathogenic bacterium Bacillus anthracis that causes the acute anthrax disease. Finally, the formation of multicellular biofilms is markedly disrupted by Relacin. Thus, we establish that Relacin, a novel ppGpp analogue, interferes with bacterial long term survival strategies, placing it as an attractive new antibacterial agent.
Interactions between Bcl-2-like proteins and BH3 domains play a key role in the regulation of apoptosis. Despite the overall structural similarity of their interaction with helical BH3 domains, Bcl-2-like proteins exhibit an intricate spectrum of binding specificities whose underlying basis is not well understood. Here, we characterize these interactions using Rosetta FlexPepBind, a protocol for the prediction of peptide binding specificity that evaluates the binding potential of different peptides based on structural models of the corresponding peptide receptor complexes. For two prominent players, Bcl-xL and Mcl-1, we obtain good agreement with a large set of experimental SPOT array measurements and recapitulate the binding specificity of peptides derived by yeast display in a previous study. We extend our approach to a third member of this family, Bcl-2: we test our blind prediction of the binding of 180 BIM-derived peptides with a corresponding experimental SPOT array. Both prediction and experiment reveal a Bcl-2 binding specificity pattern that resembles that of Bcl-xL. Finally, we extend this application to accurately predict the specificity pattern of additional human BH3-only derived peptides. This study characterizes the distinct patterns of binding specificity of BH3-only derived peptides for the Bcl-2 like proteins Bcl-xL, Mcl-1, and Bcl-2 and provides insight into the structural basis of determinants of specificity.
Post-translational lipidation by prenylation of the CaaX-box C-terminal motif in eukaryotic proteins facilitates anchoring of hydrophilic proteins, such as Ras and Rab, to membranes. A large cadre of bacterial effectors injected into host cells is anchored to host membranes by unknown mechanisms. As already documented for Legionella and Salmonella, we propose a common paradigm of microbial exploitation of the host prenylation machinery for anchoring of injected effectors to host membranes. This is supported by numerous potential microbial CaaX-box-containing proteins identified using refined bioinformatic tools. We also propose utilization of the CaaX motif as a membrane-targeting tag for proteins expressed in eukaryotic cells to facilitate deciphering of biological function.
The CAPRI (Critical Assessment of Predicted Interactions) and CASP (Critical Assessment of protein Structure Prediction) experiments have demonstrated the power of community-wide tests of methodology in assessing the current state of the art and spurring progress in the very challenging areas of protein docking and structure prediction. We sought to bring the power of community-wide experiments to bear on a very challenging protein design problem that provides a complementary but equally fundamental test of current understanding of protein-binding thermodynamics. We have generated a number of designed protein-protein interfaces with very favorable computed binding energies but which do not appear to be formed in experiments, suggesting that there may be important physical chemistry missing in the energy calculations. A total of 28 research groups took up the challenge of determining what is missing: we provided structures of 87 designed complexes and 120 naturally occurring complexes and asked participants to identify energetic contributions and/or structural features that distinguish between the two sets. The community found that electrostatics and solvation terms partially distinguish the designs from the natural complexes, largely due to the nonpolar character of the designed interactions. Beyond this polarity difference, the community found that the designed binding surfaces were, on average, structurally less embedded in the designed monomers, suggesting that backbone conformational rigidity at the designed surface is important for realization of the designed function. These results can be used to improve computational design strategies, but there is still much to be learned; for example, one designed complex, which does form in experiments, was classified by all metrics as a nonbinder. (C) 2011 Elsevier Ltd. All rights reserved.
Farnesylation is an important post-translational modification catalyzed by farnesyltransferase (FTase). Until recently it was believed that a C-terminal CaaX motif is required for farnesylation, but recent experiments have revealed larger substrate diversity. In this study, we propose a general structural modeling scheme to account for peptide binding specificity and recapitulate the experimentally derived selectivity profile of FTase in vitro. In addition to highly accurate recovery of known FTase targets, we also identify a range of novel potential targets in the human genome, including a new substrate class with an acidic C-terminal residue (CxxD/E). In vitro experiments verified farnesylation of 26/29 tested peptides, including both novel human targets, as well as peptides predicted to tightly bind FTase. This study extends the putative range of biological farnesylation substrates. Moreover, it suggests that the ability of a peptide to bind FTase is a main determinant for the farnesylation reaction. Finally, simple adaptation of our approach can contribute to more accurate and complete elucidation of peptide-mediated interactions and modifications in the cell.
Peptide–protein interactions are among the most prevalent and important interactions in the cell, but a large fraction of those interactions lack detailed structural characterization. The Rosetta FlexPepDock web server (http://flexpepdock.furmanlab.cs.huji.ac.il/) provides an interface to a high-resolution peptide docking (refinement) protocol for the modeling of peptide–protein complexes, implemented within the Rosetta framework. Given a protein receptor structure and an approximate, possibly inaccurate model of the peptide within the receptor binding site, the FlexPepDock server refines the peptide to high resolution, allowing full flexibility to the peptide backbone and to all side chains. This protocol was extensively tested and benchmarked on a wide array of non-redundant peptide–protein complexes, and was proven effective when applied to peptide starting conformations within 5.5 Å backbone root mean square deviation from the native conformation. FlexPepDock has been applied to several systems that are mediated and regulated by peptide–protein interactions. This easy to use and general web server interface allows non-expert users to accurately model their specific peptide–protein interaction of interest.
Flexible peptides that fold upon binding to another protein molecule mediate a large number of regulatory interactions in the living cell and may provide highly specific recognition modules. We present Rosetta FlexPepDockab-initio, a protocol for simultaneous docking and de-novo folding of peptides, starting from an approximate specification of the peptide binding site. Using the Rosetta fragments library and a coarse-grained structural representation of the peptide and the receptor, FlexPepDockab-initio samples efficiently and simultaneously sampled the space of possible peptide backbone conformations and rigid-body orientations over the receptor surface of a given binding site. The subsequent all-atom refinement of the coarse-grained models includes full side-chain modeling of both the receptor and the peptide, resulting in high-resolution models in which key side-chain interactions are recapitulated. The protocol was applied to a benchmark in which peptides were modeled over receptors in either their bound backbone conformations or in their free, unbound form. Near-native peptide conformations were identified in 18/26 of the bound cases and 7/14 of the unbound cases. The protocol performswell on peptides from various classes of secondary structures, including coiled peptides with unusual turns and kinks. The results presented here significantly extend the scope of state-of-the-art methods for high-resolution peptide modeling, which can now be applied to a wide variety of peptide-protein interactions where no prior information about the peptide backbone conformation is available, enabling detailed structure-based studies and manipulation of those interactions.
Escherichia coli (E. coli) mazEF is a toxin-antitoxin (TA) stress-induced module that mediates cell death requiring the quorum-sensing pentapeptide NNWNN designated EDF (extracellular death factor). E. coli toxin MazF is a sequence-specific endoribonuclease cleaving single-stranded mRNAs at ACA sequences. E. coli ChpBK, a toxin homologous to MazF, is a sequence-specific endoribonuclease cleaving single-stranded mRNAs at ACA, ACG, and ACU sequences. Here we report that, in vitro, the signaling molecule EDF significantly amplifies the endoribonucleolytic activities of both MazF and ChpBK. EDF also overcomes the inhibitory activity of the antitoxins MazE over the toxin MazF and ChpBI over ChpBK. EDF sequence is important for both functions. Moreover, direct sequence-specific binding of EDF to MazF has been confirmed. Peptide-protein modeling revealed parallel contacts between EDF-MazF and MazE-MazF. These findings are intriguing, since most known quorum-sensing molecules monitor gene expression on the transcriptional level, while EDF monitors posttranscriptionally.
Dictyostelium discoideum myosin II heavy chain kinase A (MHCK A), a member of the atypical α-kinase family, phosphorylates sites in the myosin II tail that block filament assembly. Here we show that the catalytic activity of A-CAT, the α-kinase domain of MHCK A (residues 552–841), is severely inhibited by the removal of a disordered C-terminal tail sequence (C-tail; residues 806–841). The key residue in the C-tail was identified as Thr825, which was found to be constitutively autophosphorylated. Dephosphorylation of Thr825 using shrimp alkaline phosphatase decreased A-CAT activity. The activity of a truncated A-CAT lacking Thr825 could be rescued by Pi, phosphothreonine, and a phosphorylated peptide, but not by threonine, glutamic acid, aspartic acid, or an unphosphorylated peptide. These results focused attention on a Pi-binding pocket located in the C-terminal lobe of A-CAT. Mutational analysis demonstrated that the Pi-pocket was essential for A-CAT activity. Based on these results, it is proposed that autophosphorylation of Thr825 activates ACAT by providing a covalently tethered ligand for the Pi-pocket. Ab initio modeling studies using the Rosetta FloppyTail and FlexPepDock protocols showed that it is feasible for the phosphorylated Thr825 to dock intramolecularly into the Pi-pocket. Allosteric activation is predicted to involve a conformational change in Arg734, which bridges the bound Pi to Asp762 in a key active site loop. Sequence alignments indicate that a comparable regulatory mechanism is likely to be conserved in Dictyostelium MHCK B-D and metazoan eukaryotic elongation factor-2 kinases.
In this study, we assess on a large scale the possibility of deriving self-inhibitory peptides from protein domains with globular architectures. Such inhibitory peptides would inhibit interactions of their origin domain by mimicking its mode of binding to cognate partners, and could serve as promising leads for rational design of inhibitory drugs. For our large-scale analysis, we analyzed short linear segments that were cut out of protein interfaces in silico in complex structures of protein protein docking Benchmark 3.0 and CAPRI targets from rounds 1-19. Our results suggest that more than 50% of these globular interactions are dominated by one short linear segment at the domain interface, which provides more than half of the original interaction energy. Importantly, in many cases the derived peptides show strong energetic preference for their original binding mode independently of the context of their original domain, as we demonstrate by extensive computational peptide docking experiments. As an in depth case study, we computationally design a candidate peptide to inhibit the EphB4-EphrinB2 interaction based on a short peptide derived from the G-H loop in EphrinB2. Altogether, we provide an elaborate framework for the in silico selection of candidate inhibitory molecules for protein protein interactions. Such candidate molecules can be readily subjected to wet-laboratory experiments and provide highly promising starting points for subsequent drug design. Proteins 2010; 78:3140-3149. (C) 2010 Wiley-Liss, Inc.
Glycogen synthase kinase 3 beta (GSK-3 beta) is a serine-threonine kinase belonging to the CMGC family that plays a key role in many biological processes, such as glucose metabolism, cell cycle regulation, and proliferation. Like most protein kinases, GSK-3 beta is regulated via multiple pathways and sites. We performed all-atom molecular dynamics simulations on the unphosphorylated and phosphorylated unbound GSK-3 beta and the phosphorylated GSK-3 beta bound to a peptide substrate, its product, and a derived inhibitor. We found that GSK-3 beta autophosphorylation at residue Tyr(216) results in widening of the catalytic groove, thereby facilitating substrate access. In addition, we studied the interactions of the phosphorylated GSK-3 beta with a substrate and peptide inhibitor located at the active site and observed higher affinity of the inhibitor to the kinase. Furthermore, we detected a potential remote binding site which was previously identified in other kinases. In agreement with experiments we observed that binding of specific peptides at this remote site leads to stabilization of the activation loop located in the active site. We speculate that this stabilization could enhance the catalytic activity of the kinase. We point to this remote site as being structurally conserved and suggest that the allosteric phenomenon observed here may occur in the protein kinase superfamily.
A wide range of regulatory processes in the cell are mediated by flexible peptides that fold upon binding to globular proteins. Computational efforts to model these interactions are hindered by the large number of rotatable bonds in flexible peptides relative to typical ligand molecules, and the fact that different peptides assume different backbone conformations within the same binding site. In this study, we present Rosetta FlexPepDock, a novel tool for refining coarse peptide-protein models that allows significant changes in both peptide backbone and side chains. We obtain high resolution models, often of sub-angstrom backbone quality, over an extensive and general benchmark that is based on a large nonredundant dataset of 89 peptide-protein interactions. Importantly, side chains of known binding motifs are modeled particularly well, typically with atomic accuracy. In addition, our protocol has improved modeling quality for the important application of cross docking to PDZ domains. We anticipate that the ability to create high resolution models for a wide range of peptide-protein complexes will have significant impact on structure-based functional characterization, controlled manipulation of peptide interactions, and on peptide-based drug design.
Reliable high-resolution prediction of protein complex structures starting from the free monomers is a considerable challenge toward large-scale mapping of the structural details of protein-protein interactions. The current major bottleneck is to model the conformational changes of the monomer backbone upon binding. We evaluate the use of homolog structures as source for conformational diversity, within the framework of RosettaDock-a leading high-resolution docking protocol. We find that the use of homolog templates can improve significantly the modeling of a complex structure, including known difficult cases. Several conformational changes however are not sampled by any of the templates, indicating the need for additional sources of conformational variability. Interestingly, the successful homolog templates are not restricted to a confined range of sequence identity, highlighting the importance of the backbone conformation rather than the sequence.
The key step in NF-κB activation is the release of the NF-κB dimers from their inhibitory proteins, achieved via proteolysis of the IκBs. This irreversible signaling step constitutes a commitment to transcriptional activation. The signal is eventually terminated through nuclear expulsion of NF-κB, the outcome of a negative feedback loop based on IκBα transcription, synthesis, and IκBα-dependent nuclear export of NF-κB (Karin and Ben-Neriah 2000). Here, we review the process of signal-induced IκB ubiquitination and degradation by comparing the degradation of several IκBs and discussing the characteristics of IκBs’ ubiquitin machinery.
Peptide-protein interactions are very prevalent, mediating key processes such as signal transduction and protein trafficking. How can peptides overcome the entropic cost involved in switching from an unstructured, flexible peptide to a rigid, well-defined bound structure? A structure-based analysis of peptide-protein interactions unravels that most peptides do not induce conformational changes on their partner upon binding, thus minimizing the entropic cost of binding. Furthermore, peptides display interfaces that are better packed than protein-protein interfaces and contain significantly more hydrogen bonds, mainly those involving the peptide backbone. Additionally, "hot spot" residues contribute most of the binding energy. Finally, peptides tend to bind in the largest pockets available on the protein surface. Our study is based on peptiDB, a new and comprehensive data set of 103 high-resolution peptide-protein complex structures. In addition to improved understanding of peptide-protein interactions, our findings have direct implications for the structural modeling, design, and manipulation of these interactions.
Inteins are genetic elements, inserted in-frame into protein-coding genes, whose products catalyze their removal from the protein precursor via a protein-splicing reaction. Intein domains can be split into two fragments and still ligate their flanks by a trans-protein-splicing reaction. A bioinformatic analysis of environmental metagenomic data revealed 26 different loci with a novel genomic arrangement. In each locus, a conserved enzyme coding region is broken in two by a split intein, with a free-standing endonuclease gene inserted in between. Eight types of DNA synthesis and repair enzymes have this fractured organization. The new types of naturally split-inteins were analyzed in comparison to known split-inteins. Some loci include apparent gene control elements brought in with the endonuclease gene. A newly predicted homing endonuclease family, related to very-short patch repair (Vsr) endonucleases, was found in half of the loci. These putative homing endonucleases also appear in group-I introns, and as stand-alone inserts in the absence of surrounding intervening sequences. The new fractured genes organization appears to be present mainly in phage, shows how endonucleases can integrate into inteins, and may represent a missing link in the evolution of gene breaking in general, and in the creation of split-inteins in particular.
Protein folding and binding is commonly depicted as a search for the minimum energy conformation in a vast energy landscape. indeed, modelling of protein complex structures by RosettaDock often results in a set of low-energy conformations near the native structure. Ensembles of low-energy conformations can appear, however, in other regions of the energy landscape, especially when backbone movements occur upon binding. What then characterizes the energy landscape near the correct orientation? we have applied a machine learning algorithm to distinguish ensembles of low-energy conformations around the native conformation from other low-energy ensembles. FunHunt, the resulting classifier, identified the native orientation for 50/52 protein complexes in a test set, and for all of 12 recent CAPRI targets. FunHunt is also able to choose the near-native orientation among models created by algorithms other than RosettaDock, demonstrating its general applicability for model selection. The features used by FunHunt teach us about the nature of native interfaces. Remarkably, the energy decrease of trajectories toward near-native orientations is significantly larger than for other orientations. This provides a possible explanation for the stability of association in the native orientation. The FunHunt approach, discriminating models based on ensembles of structures that map the nearby energy landscape, can be adapted and extended to additional tasks, such as ob initio model selection, protein interface design and specificity predictions.
Protein folding and binding is commonly depicted as a search for the minimum energy conformation. Modeling of protein complex structures by RosettaDock often results in a set of low-energy conformations near the native structure. Ensembles of low-energy conformations can appear, however, in other regions, especially when backbone movements occur upon binding. What then characterizes the energy landscape near the correct orientation? We applied a machine learning algorithm to distinguish ensembles of low-energy conformations around the native conformation from other low-energy ensembles. The resulting classifier, FunHunt, identifies the native orientation in 50/52 protein complexes in a test set. The features used by FunHunt teach us about the nature of native interfaces. Remarkably, the energy decrease of trajectories toward near-native orientations is significantly larger than for other orientations. This provides a possible explanation for the stability of association in the native orientation.
RosettaDock has repeatedly created high-resolution structures of protein complexes in the CAPRI experiment, thanks to the explicit modeling of conformational changes of the monomers at the side chain level. These models can be selected based on their energy. During the search for the lowest-energy model, RosettaDock samples a deep funnel around the native orientation, but additional funnels may appear in the energy landscape, especially in cases where backbone conformational changes occur upon binding. We have previously developed FunHunt, a Support Vector Machine-based classifier that distinguishes the energy funnels around the native orientation from other funnels in the energy landscape. Here we assess the ability of FunHunt to help in model selection in the CAPRI experiment. For all of 12 recent CAPRI targets, FunHunt clearly identifies a near-native funnel in comparison to the funnel around the lowest energy model identified by the RosettaDock global search protocol. FunHunt is also able to choose a near-native orientation among models submitted by predictor groups, demonstrating its general applicability for model selection. This suggests that FunHunt will be a valuable tool in coming CAPRI rounds for the selection of models, and for the definition of regions that need further refinement with restricted backbone flexibility.
A challenge in protein-protein docking is to account for the conformational changes in the monomers that occur upon binding. The RosettaDock method, which incorporates sidechain flexibility but keeps the backbone fixed, was found in previous CAPRI rounds (4 and 5) to generate docking models with atomic accuracy, provided that conformational changes were mainly restricted to protein sidechains. In the recent rounds of CAPRI (6-12), large backbone conformational changes occur upon binding for several target complexes. To address these challenges, we explicitly introduced backbone flexibility in our modeling procedures by combining rigid-body docking with protein structure prediction techniques such as modeling variable hops and building homology models. Encouragingly, using this approach we were able to correctly predict a significant backbone conformational change of an interface loop for Target 20 (12 Å rmsd between those in the unbound monomer and complex structures), but accounting for backbone flexibility in protein-protein docking is still very challenging because of the significantly larger conformational space, which must be surveyed. Motivated by these CAPRI challenges, we have made progress in reformulating RosettaDock using a "fold-tree" representation, which provides a general framework for treating a wide variety of flexible-backbone docking problems.