Coherent Blending of Biophysics-Based Knowledge with Bayesian Neural Networks for Robust Protein Property Prediction

ACS Synthetic Biology (2023)

Abstract
Predicting properties of proteins is of interest for basic biological understanding and protein engineering alike. Increasingly, machine learning (ML) approaches are being used for this task. However, the accuracy of such ML models typically degrades as test proteins stray further from the training data distribution. On the other hand, models that are more data-free, such as biophysics-based models, are typically uniformly accurate over all of the protein space, even if inferior for test points close to the training distribution. Consequently, being able to cohesively blend these two types of information within one model, as appropriate in different parts of the protein space, will improve overall performance. Herein, we tackle just this problem to yield a simple, practical, and scalable approach that can be easily implemented. In particular, we use a Bayesian formulation to integrate biophysical knowledge into neural networks. However, in doing so, a technical challenge arises: Bayesian neural networks (BNNs) enable the user to specify prior information only on the neural network weight parameters, rather than on the function values given to us from a typical biophysics-based model. Consequently, we devise a principled probabilistic method to overcome this challenge. Our approach yields intuitively pleasing results: predictions rely more heavily on the biophysical prior information when the BNN epistemic uncertainty (uncertainty arising from a lack of training data rather than sensor noise) is large, and more heavily on the neural network when the epistemic uncertainty is small. We demonstrate this approach on an illustrative synthetic example, on two examples of protein property prediction (fluorescence and binding), and, for generality, on one small molecule property prediction problem.
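The qualitative behavior described in the abstract (lean on the biophysical prior where epistemic uncertainty is large, lean on the network where it is small) can be illustrated with a simple precision-weighted average of two Gaussian estimates. This is only a minimal sketch under assumed simplifications, not the method of the paper, which places the prior through a BNN formulation. The function name `blend_with_prior` and all of its parameters are hypothetical, and the network's epistemic variance is assumed to be available already (for example, from an ensemble).

```python
import numpy as np

def blend_with_prior(nn_mean, nn_epistemic_var, prior_mean, prior_var):
    """Precision-weighted blend of a network prediction and a prior prediction.

    Hypothetical helper for illustration only: both inputs are treated as
    independent Gaussian estimates of the same property value.
    """
    nn_mean = np.asarray(nn_mean, dtype=float)
    nn_epistemic_var = np.asarray(nn_epistemic_var, dtype=float)
    prior_mean = np.asarray(prior_mean, dtype=float)

    # The weight on the prior grows as the network's epistemic variance grows.
    prior_weight = nn_epistemic_var / (nn_epistemic_var + prior_var)

    # Convex combination of the two means.
    blended_mean = prior_weight * prior_mean + (1.0 - prior_weight) * nn_mean

    # Variance of the product of two Gaussians (never larger than either input).
    blended_var = nn_epistemic_var * prior_var / (nn_epistemic_var + prior_var)
    return blended_mean, blended_var, prior_weight

# Two test points: one near the training data (low epistemic variance),
# one far from it (high epistemic variance). Values are made up.
mean, var, weight = blend_with_prior(
    nn_mean=[1.0, 1.0],
    nn_epistemic_var=[0.01, 100.0],
    prior_mean=[0.0, 0.0],
    prior_var=1.0,
)
print(weight)  # small weight on the prior for the first point, large for the second
```

In this toy setting the first point stays close to the network prediction and the second is pulled almost entirely to the prior, which mirrors the behavior the abstract reports.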
Keywords
machine learning, protein engineering, Bayesian methodology, biophysical models, uncertainty quantification, deep learning