Improving Antibody Humanness Prediction using Patent Data
CoRR (2024)
Abstract
We investigate the potential of patent data for improving antibody
humanness prediction using a multi-stage, multi-loss training process.
Humanness serves as a proxy for the immunogenic response to antibody
therapeutics, one of the major causes of attrition in drug discovery and a
challenging obstacle for their use in clinical settings. We pose the initial
learning stage as a weakly-supervised contrastive-learning problem, where each
antibody sequence is associated with possibly multiple identifiers of function
and the objective is to learn an encoder that groups them according to their
patented properties. We then freeze a part of the contrastive encoder and
continue training it on the patent data using the cross-entropy loss to predict
the humanness score of a given antibody sequence. We illustrate the utility of
the patent data and our approach by performing inference on three different
immunogenicity datasets, unseen during training. Our empirical results
demonstrate that the learned model consistently outperforms the alternative
baselines and establishes a new state of the art on five out of six inference
tasks, irrespective of the metric used.
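
The first, weakly-supervised contrastive stage can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a supervised-contrastive-style loss in which two sequences count as positives whenever they share at least one function identifier. The function name `multilabel_contrastive_loss`, the temperature value, and the toy embeddings are all hypothetical.

```python
import numpy as np

def multilabel_contrastive_loss(embeddings, labels, temperature=0.1):
    """Contrastive loss for multi-label data (illustrative assumption).

    embeddings: (n, d) float array, one row per antibody sequence (n >= 2).
    labels:     (n, k) multi-hot array; labels[i, j] = 1 if sequence i
                carries function identifier j.
    Two sequences are treated as positives if they share any identifier.
    """
    # L2-normalise so the dot product is a cosine similarity.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    n = sim.shape[0]
    self_mask = np.eye(n, dtype=bool)
    # Exclude each anchor from its own softmax denominator.
    sim = np.where(self_mask, -np.inf, sim)

    # Numerically stable log-softmax over all other samples.
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - row_max - log_denominator

    # Positive pairs: at least one shared identifier, excluding self-pairs.
    positives = (labels @ labels.T > 0) & ~self_mask
    n_pos = positives.sum(axis=1)
    has_pos = n_pos > 0
    if not has_pos.any():
        return 0.0

    # Average log-probability of the positives for each anchor that has any.
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[has_pos] / n_pos[has_pos]
    return float(per_anchor.mean())

# Toy data: two tight clusters in 2-D.
toy_embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
aligned_labels = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])     # match clusters
misaligned_labels = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])  # cut across clusters

loss_aligned = multilabel_contrastive_loss(toy_embeddings, aligned_labels)
loss_misaligned = multilabel_contrastive_loss(toy_embeddings, misaligned_labels)
```

On the toy data, the loss is lower when the identifiers agree with the embedding clusters than when they cut across them, which is the behaviour the encoder is trained to produce. The second stage described above (freezing part of the encoder and training with cross-entropy on humanness scores) is not covered by this sketch.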