OligoFormer: an accurate and robust prediction method for siRNA design

Yilan Bai, Haochen Zhong, Taiwei Wang, Zhi John Lu

bioRxiv (2024)

Abstract
Motivation: Small interfering RNA (siRNA) has become a widely used experimental approach for post-transcriptional regulation and is increasingly showing potential as a future targeted drug. However, the prediction of highly efficient siRNAs is still hindered by dataset biases, the inadequacy of prediction methods, and the presence of off-target effects. To overcome these limitations, we propose a novel model, OligoFormer, for the prediction of siRNA efficacy.

Results: OligoFormer comprises three modules: a thermodynamic calculation, an RNA-FM module, and an Oligo encoder. Taking siRNA and mRNA sequences as input, OligoFormer obtains embeddings through the Oligo encoder and the pre-trained language model RNA-FM[1], and combines them with thermodynamic parameters to predict the efficacy of siRNAs. As far as we know, OligoFormer is the first siRNA design model to use a transformer encoder and RNA-FM embeddings. We carefully benchmark OligoFormer against five comparable methods on siRNA efficacy datasets. OligoFormer outperforms all the other methods, with an average improvement of 9% in AUC and 10.7% in F1 score in our inter-dataset validation. As an application of OligoFormer, we also provide a complete pipeline that assesses off-target effects using the PITA score and the TargetScan score. The ablation study shows that the RNA-FM module and the thermodynamic parameters improve the model's performance and convergence speed. The saliency map obtained by gradient backpropagation shows some base preferences in the initial and terminal regions of siRNAs.

Competing Interest Statement: The authors have declared no competing interest.
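The benchmark results above are reported as AUC and F1 score. The abstract does not describe the evaluation code, so the following is only a minimal sketch of how these two metrics are generally computed. It assumes binary efficacy labels (1 = effective siRNA) and a decision threshold of 0.5; the function names and the toy labels and scores are hypothetical illustrations, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(labels, scores, threshold=0.5):
    """F1 = harmonic mean of precision and recall after thresholding scores.
    The 0.5 default threshold is an assumption made for this sketch."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = effective siRNA, 0 = ineffective siRNA.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

auc = roc_auc(labels, scores)   # 8 of 9 positive/negative pairs are ranked correctly
f1 = f1_score(labels, scores)   # precision = recall = 2/3 at threshold 0.5
print(f"AUC = {auc:.3f}, F1 = {f1:.3f}")
```

AUC depends only on how the model ranks siRNAs, whereas F1 depends on the chosen threshold, which is one reason the two metrics can show different improvements for the same model.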