MotifScope: a multi-sample motif discovery and visualization tool for tandem repeats

bioRxiv (2024)

Abstract
Tandem repeats (TRs) constitute a significant portion of the human genome, exhibiting high levels of polymorphism due to variations in size and motif composition. These variations have been associated with various neuropathological disorders, underscoring the clinical importance of TRs. Furthermore, the motif structure of these repeats can offer valuable insights into evolutionary dynamics and population structure. However, analysis of TRs has been hampered by the limitations of short-read sequencing technology, which lacks the ability to fully capture the complexity of these sequences. With long-read data becoming more accessible, there is now also a need for tools to explore and characterize these TRs. In this study, we introduce MotifScope, a novel algorithm for visualization of TRs in their population context based on a de novo k-mer approach for motif discovery. Comparative analysis against three established tools, uTR, TRF, and vamos, reveals that MotifScope can identify a greater number of motifs and more accurately represent the actual repeat sequence. Additionally, MotifScope enables comparison of sequencing reads within an individual and assemblies across different individuals, showing its applicability in diverse genomic contexts. We demonstrate potential applications of MotifScope in diverse fields, including population genetics, clinical settings, and forensic analyses.

Competing Interest Statement

The authors have declared no competing interest.
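
To give a rough sense of what a de novo k-mer approach to motif discovery can look like, the following is a minimal sketch. It is not MotifScope's algorithm, which the abstract does not detail. It assumes a simple heuristic: a k-mer is a candidate motif if it is immediately followed by an identical copy, and rotations of the same unit (for example CAG, AGC, GCA) are collapsed to one representative. The function names `canonical_rotation` and `tandem_kmer_counts` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def canonical_rotation(kmer: str) -> str:
    """Return the lexicographically smallest rotation of a k-mer.

    Assumption for this sketch: rotations of one repeat unit
    (e.g. CAG, AGC, GCA) are treated as the same motif.
    """
    return min(kmer[i:] + kmer[:i] for i in range(len(kmer)))


def tandem_kmer_counts(seq: str, k: int) -> Counter:
    """Count k-mers that are directly followed by an identical copy.

    This is an illustrative heuristic, not the method used by MotifScope.
    """
    counts = Counter()
    # Slide over every position where two adjacent k-mers fit in the sequence.
    for i in range(len(seq) - 2 * k + 1):
        kmer = seq[i:i + k]
        if seq[i + k:i + 2 * k] == kmer:
            counts[canonical_rotation(kmer)] += 1
    return counts


if __name__ == "__main__":
    # Toy sequence made of four CAG units; real input would be a long read
    # or an assembly region spanning the tandem repeat.
    print(tandem_kmer_counts("CAGCAGCAGCAG", 3).most_common(1))
```

Running the same count over a range of k values and comparing the most frequent candidates is one simple way to guess the repeat unit length without a predefined motif catalogue.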