Revisiting Y-chromosome detection methods: R-CQ and KAMY efficiently identify Y chromosome sequences in Tephritidae insect pests

Dimitris Rallis, Konstantina T Tsoumani, Flavia Krsticevic, Philippos Aris Papathanos, Kostas D Mathiopoulos, Alexie Papanicolaou

bioRxiv (2024)

Abstract
Background The repetitive and heterochromatic nature of Y chromosomes poses challenges for genome assembly methods, which can lead to fragmented or misassembled scaffolds. While new sequencing technologies and assembly techniques are becoming popular, tools for improving the generation of an accurate Y chromosome assembly are limited, especially for taxa such as insects, in which heterochromatic chromosomes occur frequently.

Results Two novel Y-detection methods are presented here, R-CQ and KAMY, which revisit the ratio-based Chromosome Quotient and the k-mer-based Y-Genome Scan methods, respectively. We benchmark R-CQ and KAMY against their predecessors on their ability to identify Y-derived regions in genome assemblies of two important insect pests of the family Tephritidae: the olive fruit fly Bactrocera oleae and the Mediterranean fruit fly Ceratitis capitata. These species are characterised by different Y-chromosome morphologies, and their genomes were sequenced with different methodologies. We also evaluated the efficiency and generic applicability of these methods using suitable genomic data from Drosophila melanogaster, whose Y chromosome is the best studied among insects. Furthermore, KAMY was assessed for its capability to identify Y-derived transcripts in the absence of a reference Y sequence, and it effectively identified the Tephritid maleness factor MoY in a set of mixed-sex transcriptomic data. We also describe a methodology for manually curating the computational results, which we use to determine the performance of the different Y-detection methods together with the size and quality of the assembled Y sequences.

Conclusions We find variability in the performance of Y-detection methods that is highly dependent on the sequencing approach used and on the sequence of the Y. Our benchmarking suggests an improved overall efficiency of KAMY and R-CQ compared to their predecessors, while our analysis highlights the importance of manually curating the algorithmic outputs to describe the accuracy and quality of identified Y sequences. Based on our results, we provide some recommendations for future sequencing efforts in insects to best support downstream Y assembly steps.

### Competing Interest Statement
The authors have declared no competing interest.