Improved chromosome-level genome assembly for marigold (Tagetes erecta)

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract
Marigold (Tagetes erecta L.) is a popular ornamental plant of the Asteraceae family, and its petals are considered the most abundant source of lutein. A low-continuity chromosome-level genome sequence of marigold was published recently, but its protein-coding genes were poorly annotated, which hinders studies of lutein biosynthesis. Here, we generated a near telomere-to-telomere level genome assembly of marigold based on highly accurate high-fidelity (HiFi) long reads and Hi-C sequencing data. Compared to the previously reported marigold genome, the current assembly has markedly higher contiguity and a more complete gene set: a 27-fold increase in contig N50 size, a 12.1% increase in chromosome anchoring rate, and a 9.0% increase in the BUSCO complete rate for the gene set. The current assembly also has far fewer assembly errors. Based on this high-quality genome assembly, we found that 170-bp repeats are the most abundant centromeric unit and that the centromeric regions are distributed along the whole length of all 12 chromosomes, indicating the existence of holocentromeres in marigold. In addition, we analyzed the structure and phylogenetic relationships of the four PSY genes and found that these genes have diversified and possibly perform different functions in various tissues. Our near telomere-to-telomere level genome assembly and comprehensive gene annotation will greatly facilitate marigold breeding and research aimed at improving lutein production.
Keywords
genome assembly, marigold, chromosome-level