Measuring functional similarity of lncRNAs based on variable K-mer profiles of nucleotide sequences.

Zhixia Teng, Linyue Shi, Haihao Yu, Chengyan Wu, Zhen Tian

Methods (San Diego, Calif.) (2023)

Abstract
Long non-coding RNAs (lncRNAs) are a class of essential non-coding RNAs longer than 200 nt. Recent studies indicate that lncRNAs have various complex regulatory functions that strongly affect many fundamental biological processes. However, measuring the functional similarity between lncRNAs with traditional wet-lab experiments is time-consuming and labor-intensive, so computational approaches have become an effective alternative. Most sequence-based computational methods measure the functional similarity of lncRNAs using fixed-length vector representations, which cannot capture the features of larger k-mers. It is therefore important to improve the prediction of the potential regulatory functions of lncRNAs. In this study, we propose a novel approach called MFSLNC to comprehensively measure the functional similarity of lncRNAs based on variable k-mer profiles of nucleotide sequences. MFSLNC stores k-mers in a dictionary tree (trie), which allows lncRNAs to be represented comprehensively with long k-mers. The functional similarity between lncRNAs is evaluated with the Jaccard similarity. MFSLNC verified the similarity between two lncRNAs with the same mechanism and detected homologous sequence pairs between human and mouse. MFSLNC was also applied to lncRNA-disease association prediction in combination with the association prediction model WKNKN. Moreover, a comparison with classical methods on lncRNA-mRNA association data showed that our method calculates the similarity of lncRNAs more effectively. The prediction AUC is 0.867, which compares well with similar models.
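The core idea described above (variable-length k-mers stored in a dictionary tree and compared with the Jaccard similarity) can be illustrated with a minimal sketch. This is not the MFSLNC implementation: the names `KmerTrie`, `build_profile`, and `jaccard`, the `k_min`/`k_max` range, the `"$"` end marker, and the choice of a presence/absence (set-based) Jaccard index are all assumptions made for illustration.

```python
from typing import Dict, Set


class KmerTrie:
    """Hypothetical prefix tree holding the k-mers of one sequence."""

    def __init__(self) -> None:
        self.root: Dict[str, dict] = {}

    def insert(self, window: str, k_min: int) -> None:
        # Walk the window base by base; every prefix whose length is at
        # least k_min is marked as a stored k-mer with the "$" key.
        node = self.root
        for depth, base in enumerate(window, start=1):
            node = node.setdefault(base, {})
            if depth >= k_min:
                node["$"] = True

    def kmers(self) -> Set[str]:
        # Depth-first traversal that collects every marked k-mer.
        found: Set[str] = set()
        stack = [("", self.root)]
        while stack:
            prefix, node = stack.pop()
            for key, child in node.items():
                if key == "$":
                    found.add(prefix)
                else:
                    stack.append((prefix + key, child))
        return found


def build_profile(seq: str, k_min: int, k_max: int) -> KmerTrie:
    """Insert all k-mers with k_min <= k <= k_max (assumed range)."""
    trie = KmerTrie()
    seq = seq.upper()
    for start in range(len(seq)):
        window = seq[start:start + k_max]
        if len(window) >= k_min:
            trie.insert(window, k_min)
    return trie


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set-based Jaccard index |A ∩ B| / |A ∪ B| (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Toy sequences, not real lncRNAs.
    p1 = build_profile("ACGT", k_min=2, k_max=3).kmers()
    p2 = build_profile("ACGA", k_min=2, k_max=3).kmers()
    print(sorted(p1), sorted(p2), jaccard(p1, p2))
```

Inserting only the longest window at each position and marking its prefixes lets shared prefixes be stored once, which is one reason a trie can be attractive when k grows large.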
Keywords
K-mer profiles, RNA binding sites, lncRNA functional similarity