Exploring experimental structures and computed structure models from artificial intelligence/machine learning at RCSB Protein Data Bank (RCSB PDB, RCSB.org)

Biophysical Journal (2023)

Abstract
New Artificial Intelligence/Machine Learning (AI/ML) approaches to protein structure prediction have achieved accuracies comparable to low-resolution experimental techniques. Recent breakthroughs (AlphaFold2, RoseTTAFold, etc.) have changed the way Computed Structure Models (CSMs) of proteins are used across the life sciences. Although more than 50,000 structures of human proteins have been deposited to the global Protein Data Bank (PDB), public-domain experimental coverage of the human proteome currently stands at only ∼17% of amino acid residues. Including CSMs from AI/ML approaches raises structural coverage of the human proteome to ∼58% of amino acid residues, counting residues either defined by experimental methods or confidently predicted by AI/ML software. RCSB PDB recently modified its infrastructure and services to support delivery of CSMs alongside experimentally determined structures archived in the PDB. At present, the RCSB.org web portal provides open access to ∼200,000 experimentally determined PDB structures plus >1 million CSMs. Within RCSB.org, CSMs are treated as first-class objects, fully integrated within the RCSB PDB data ecosystem. Users can search, analyze, visualize, compare, and explore PDB structures and CSMs side by side. Integrating PDB structures and CSMs on RCSB.org offers the following benefits: (1) Many proteins of interest are not represented in the PDB. (2) CSMs provide users with structural information for full-length polypeptide chains. (3) CSMs can accelerate structure determination by 3D electron microscopy and integrative methods. (4) CSMs can be analyzed and visualized more effectively in the context of experimentally determined PDB structures.
RCSB PDB Core Operations are funded by the National Science Foundation (DBI-1832184), the US Department of Energy (DE-SC0019749), and the National Cancer Institute, National Institute of Allergy and Infectious Diseases, and National Institute of General Medical Sciences of the National Institutes of Health under grant R01GM133198.
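As an illustration of searching PDB structures and CSMs side by side, the following is a minimal sketch that builds a request body for the RCSB PDB Search API. The abstract does not describe any programmatic interface, so the endpoint URL, the `full_text` service, and the `results_content_type` option are assumptions drawn from the public Search API (v2) and should be checked against its current documentation. The helper `build_query` is a hypothetical name, and the sketch only constructs the payload without sending it.

```python
import json

# Assumed Search API (v2) endpoint; not stated in the abstract.
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def build_query(text, include_csms=True):
    """Build a full-text search payload (hypothetical helper).

    With include_csms=True the request asks for both experimental
    PDB structures and computed structure models.
    """
    content_types = ["experimental"]
    if include_csms:
        content_types.append("computational")
    return {
        "query": {
            "type": "terminal",
            "service": "full_text",
            "parameters": {"value": text},
        },
        "return_type": "entry",
        # Assumed option controlling which kinds of entries are returned.
        "request_options": {"results_content_type": content_types},
    }


payload = build_query("hemoglobin")
print(json.dumps(payload, indent=2))
```

To run the search, the payload would be sent as the JSON body of a POST request to `SEARCH_URL` with any HTTP client. Dropping `"computational"` from the list restricts results to experimentally determined structures.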
Keywords
RCSB Protein Data Bank, structures models, RCSB PDB, experimental structures