Identification of DNA N4-methylcytosine Sites via Multiview Kernel Sparse Representation Model

IEEE Transactions on Artificial Intelligence (2023)

Abstract
Identifying DNA N4-methylcytosine (4mC) sites is of great significance in biological research on chromatin structure, DNA stability, DNA–protein interaction, and the control of gene expression. However, identifying 4mC sites with traditional sequencing technology is very time-consuming. To detect 4mC sites more effectively, we develop a multiview learning method that merges multiple feature spaces. Furthermore, we investigate whether the multiview learning method can improve cross-species classification ability by fusing data from multiple species. In our study, we propose a multiview Laplacian kernel sparse representation-based classifier, called MvLapKSRC-HSIC. First, we use three feature extraction methods (position-specific trinucleotide propensity, nucleotide chemical property, and DNA physicochemical properties) to extract DNA sequence features. MvLapKSRC-HSIC uses a kernel sparse representation-based classifier with graph regularization. To maintain independence between the views, we add a multiview regularization term constructed from the Hilbert–Schmidt independence criterion (HSIC). In the experiments, MvLapKSRC-HSIC is applied to six datasets and compared with other popular methods in single-species and cross-species experiments. All experimental results show that MvLapKSRC-HSIC outperforms the other methods in both single-species and cross-species settings. Importantly, MvLapKSRC-HSIC can identify a series of potential DNA 4mC sites that have not yet been experimentally evaluated in multiple species and merit further research.
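The HSIC term mentioned above can be illustrated with the standard biased empirical estimator, HSIC(K, L) = tr(KHLH) / (n - 1)^2, where K and L are Gram matrices of two views and H is the centering matrix. The following is only a minimal sketch of that estimator, not the paper's regularizer: the abstract does not specify the kernels or how the term enters the objective, so the Gaussian kernel, the `gamma` value, the function names, and the random "views" below are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, gamma=0.1):
    """Gaussian (RBF) Gram matrix; the kernel choice and gamma are assumptions."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative distances caused by floating-point error.
    return np.exp(-gamma * np.maximum(d2, 0.0))

def hsic(K, L):
    """Biased empirical HSIC between two n x n Gram matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Hypothetical data: two "views" of the same 50 samples.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(50, 4))
view_b_dependent = view_a + 0.01 * rng.normal(size=(50, 4))  # nearly a copy of view_a
view_b_independent = rng.normal(size=(50, 4))                # unrelated to view_a

hsic_dep = hsic(rbf_kernel(view_a), rbf_kernel(view_b_dependent))
hsic_ind = hsic(rbf_kernel(view_a), rbf_kernel(view_b_independent))
print("dependent views:  ", hsic_dep)
print("independent views:", hsic_ind)
```

A larger value indicates stronger statistical dependence between the two views, so a term of this kind can serve as a measure of how independent the views are.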
Keywords
DNA N4-methylcytosine (4mC) sites, kernel method, multiview learning, sequence classification, sparse representation-based classifier (SRC)