Guide to k-mer approaches for genomics across the tree of life
arXiv (2024)
Abstract
The wide array of currently available genomes displays a wonderful diversity
in size, composition and structure, with many more to come thanks to several
global biodiversity genomics initiatives launched in recent years. However,
sequencing of genomes, even with all the recent advances, can still be
challenging for both technical (e.g. small physical size, contaminated samples,
or limited access to appropriate sequencing platforms) and biological reasons (e.g.
germline restricted DNA, variable ploidy levels, sex chromosomes, or very large
genomes). In recent years, k-mer-based techniques have become popular to
overcome some of these challenges. They are based on the simple process of
dividing the analysed sequences (e.g. raw reads or genomes) into a set of
sub-sequences of length k, called k-mers. Despite this apparent simplicity,
k-mer-based analysis allows for a rapid and intuitive assessment of complex
sequencing datasets. Here, we provide the first comprehensive review of the
theoretical properties and practical applications of k-mers in biodiversity
genomics, serving as a reference manual for this powerful approach.
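
The decomposition described in the abstract, splitting sequences into overlapping sub-sequences of length k, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `iter_kmers` and `count_kmers` and the toy reads are hypothetical, chosen only for illustration.

```python
from collections import Counter


def iter_kmers(sequence, k):
    """Yield every overlapping sub-sequence of length k (hypothetical helper)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence of length n contains n - k + 1 k-mers (none if n < k).
    for start in range(len(sequence) - k + 1):
        yield sequence[start:start + k]


def count_kmers(sequences, k):
    """Count k-mer occurrences across an iterable of sequences, e.g. raw reads."""
    counts = Counter()
    for seq in sequences:
        counts.update(iter_kmers(seq.upper(), k))
    return counts


# Made-up toy reads, for illustration only.
reads = ["ACGTAC", "CGTACG"]
kmer_counts = count_kmers(reads, 3)
```

Dedicated k-mer counters commonly collapse each k-mer with its reverse complement into a single canonical form and use far more memory-efficient data structures; this sketch omits both for brevity.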