Transfer learning for Heterocycle Synthesis Prediction

Ewa Wieczorek, Joshua W. Sin, Matthew T. O. Holland, Liam Wilbraham, Victor S. Perez, Anthony Bradley, Dominik Miketa, Paul E. Brennan, Fernanda Duarte

Crossref (2024)

Abstract
Heterocycles are important scaffolds in medicinal chemistry that can be used to modulate the binding mode as well as the pharmacokinetic properties of drugs. Their importance is reflected in the publication of numerous datasets of heterocyclic rings and their properties. However, those datasets lack synthetic routes towards the published heterocycles, so novel and uncommon heterocycles are not easily synthetically accessible. Although retrosynthesis prediction models can normally assist synthetic chemists, they perform poorly on heterocycle formation reactions because of low data availability. In this work, we compare four transfer learning methods for overcoming the low data availability problem and improving the performance of retrosynthesis prediction models on ring-breaking disconnections. The mixed fine-tuned model achieves a top-1 accuracy of 36.5%, and 62.1% of its predictions are chemically valid and ring-breaking. Furthermore, we demonstrate the applicability of the mixed fine-tuned model in drug discovery by recreating synthetic routes towards two drug-like targets published this year. Finally, we introduce a method for further fine-tuning the model as new reaction data becomes available.