Abstract C118: Design of Programmable Peptide-Guided Oncoprotein Degraders Via Generative Language Models
Molecular Cancer Therapeutics (2023)
Duke University
Abstract
Targeted protein degradation of pathogenic proteins represents a powerful new treatment strategy for multiple cancers. Unfortunately, a sizable portion of these proteins are considered "undruggable" by standard small molecule-based approaches, including PROTACs and molecular glues, largely due to their disordered nature, instability, and lack of binding site accessibility. As a more modular strategy, we have developed a genetically encoded protein architecture by fusing target-specific peptides to E3 ubiquitin ligase domains for selective and potent intracellular degradation of oncoproteins. To enable programmability of our system, we develop a suite of algorithms that enable the design of target-specific peptides via protein language model (pLM) embeddings, without the requirement of 3D structures. First, we train a model that leverages pLM embeddings to efficiently select high-affinity peptides from natural protein interaction interfaces. Next, we develop a high-accuracy discriminator, based on the contrastive language-image pretraining (CLIP) architecture underlying OpenAI's DALL-E model, to prioritize and screen peptides with selectivity to a specified target oncoprotein. As input to the discriminator, we create a Gaussian diffusion generator to sample a pLM latent space, fine-tuned on experimentally valid peptide sequences. Finally, to enable de novo design of binding peptides, we train an instance of GPT-2 with protein interacting sequences to enable peptide generation conditioned on target oncoprotein sequences. Our models demonstrate low perplexities across both existing and generated peptide sequences, highlighting their robust generative capability. By experimentally fusing model-derived peptides to E3 ubiquitin ligase domains, we reliably identify candidates exhibiting robust and selective endogenous degradation of diverse "undruggable" oncoproteins in cancer cell models, including tumorigenic regulators such as β-catenin and TRIM8, as well as oncogenic fusion proteins such as EWS-FLI1, PAX3-FOXO1, and DNAJB1-PRKACA. We further show that our peptide-guided degraders have negligible off-target effects via whole-cell proteomics and demonstrate their modulation of transcriptional and apoptotic pathways, motivating further translation of our therapeutic platform. Together, our work establishes a CRISPR-analogous system for programmable protein degradation applications across the oncoproteome.

Citation Format: Suhaas Bhat, Garyk Brixi, Kalyan Palepu, Lauren Hong, Vivian Yudistyra, Tianlai Chen, Sophia Vincoff, Lin Zhao, Pranam Chatterjee. Design of programmable peptide-guided oncoprotein degraders via generative language models [abstract]. In: Proceedings of the AACR-NCI-EORTC Virtual International Conference on Molecular Targets and Cancer Therapeutics; 2023 Oct 11-15; Boston, MA. Philadelphia (PA): AACR; Mol Cancer Ther 2023;22(12 Suppl):Abstract nr C118.
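For illustration only, below is a minimal sketch of the kind of CLIP-style contrastive discriminator described in the abstract: two projection heads map peptide and target pLM embeddings into a shared space, and a symmetric cross-entropy loss pulls matched peptide-target pairs together while pushing mismatched pairs apart. The class name `PeptideTargetCLIP`, the 1280-dimensional embeddings (typical of ESM-2), the projection size, and the random tensors standing in for real pLM embeddings are all assumptions for the sketch, not the authors' released code.

```python
# Hypothetical sketch (not the authors' implementation): a CLIP-style
# contrastive discriminator that scores peptide-target pairs from
# protein-language-model (pLM) embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PeptideTargetCLIP(nn.Module):
    def __init__(self, pep_dim=1280, tgt_dim=1280, proj_dim=256):
        super().__init__()
        # Separate projection heads map peptide and target pLM embeddings
        # into a shared latent space, analogous to CLIP's image/text towers.
        self.pep_proj = nn.Linear(pep_dim, proj_dim)
        self.tgt_proj = nn.Linear(tgt_dim, proj_dim)
        self.logit_scale = nn.Parameter(torch.tensor(2.6592))  # log(1/0.07)

    def forward(self, pep_emb, tgt_emb):
        # L2-normalize so the dot product is a cosine similarity.
        p = F.normalize(self.pep_proj(pep_emb), dim=-1)
        t = F.normalize(self.tgt_proj(tgt_emb), dim=-1)
        return self.logit_scale.exp() * p @ t.T  # pairwise similarity logits

def clip_loss(logits):
    # Symmetric cross-entropy: each peptide should match its own target
    # (the diagonal of the logit matrix) and vice versa.
    labels = torch.arange(logits.size(0), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.T, labels))

if __name__ == "__main__":
    # Random stand-ins for mean-pooled pLM embeddings of a batch of
    # 8 paired peptide/target sequences.
    pep_emb = torch.randn(8, 1280)
    tgt_emb = torch.randn(8, 1280)
    model = PeptideTargetCLIP()
    logits = model(pep_emb, tgt_emb)
    print("contrastive loss:", clip_loss(logits).item())
```

In a workflow like the one the abstract outlines, such a discriminator would take candidate peptides (e.g., from the diffusion or GPT-2 generators) and rank them by similarity to the embedding of a specified target oncoprotein.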