CEMIG: Prediction of the cis-regulatory motif using the De Bruijn graph from ATAC-seq

bioRxiv (2023)

Abstract
Sequence motif discovery algorithms identify novel DNA patterns with significant biological roles, such as transcription factor (TF) binding site motifs. Chromatin accessibility data, accumulated through the assay for transposase-accessible chromatin with sequencing (ATAC-seq), have enriched the resources available for motif discovery. However, computational efforts in ATAC-seq data analysis mainly target footprinting of TF binding activity rather than motif prediction. Here, we introduce CEMIG, an algorithm that predicts and characterizes TF binding sites by leveraging De Bruijn and Hamming distance graph models. Evaluation on 129 ATAC-seq datasets from the Cistrome Data Browser suggests that CEMIG outperforms three widely used methods on four metrics. Notably, CEMIG is used to predict cell-type-specific and shared TF motifs in GM12878 and K562 cells, facilitating comprehensive gene expression and functional genomics analysis.

Competing Interest Statement: The authors have declared no competing interest.
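The abstract names two graph models without describing how CEMIG builds them. As a rough illustration of the general ideas only, the sketch below constructs a De Bruijn graph over k-mers and a Hamming distance graph linking near-identical k-mers. The function names, the toy sequences, the choice of k = 4, and the distance threshold of 1 are all assumptions made for this example; none of them are taken from CEMIG.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def de_bruijn_graph(sequences, k):
    """Generic De Bruijn graph: nodes are (k-1)-mers, and each k-mer
    contributes an edge prefix -> suffix, weighted by its count."""
    edges = defaultdict(int)
    for seq in sequences:
        for kmer in kmers(seq, k):
            edges[(kmer[:-1], kmer[1:])] += 1
    return dict(edges)


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def hamming_graph(kmer_set, max_dist=1):
    """Generic Hamming distance graph: link k-mers that differ in at
    most max_dist positions (threshold is an illustrative assumption)."""
    nodes = sorted(kmer_set)
    return [(a, b) for a, b in combinations(nodes, 2)
            if hamming(a, b) <= max_dist]


# Hypothetical toy sequences standing in for accessible-region reads.
toy_peaks = ["ACGTGA", "TCGTGC"]
dbg = de_bruijn_graph(toy_peaks, k=4)
hd = hamming_graph({km for s in toy_peaks for km in kmers(s, 4)}, max_dist=1)
```

In this toy input the k-mer CGTG occurs in both sequences, so its De Bruijn edge carries weight 2, while the Hamming graph links k-mers such as ACGT and TCGT that differ at a single position. A real motif finder would add steps such as background modeling and statistical scoring, which this sketch leaves out.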