A large-scale behavior of allelic dropout and imbalance caused by DNA methylation changes in an early-ripening bud sport of peach

New Phytologist (2023)

Abstract
Perennial fruit crops are characterized by a long juvenile period and generation time, high levels of heterozygosity, and self-incompatibility. To maintain their genotypes, fruit crops are propagated asexually from vegetative parts such as buds. Bud sports with novel characteristics have therefore been extensively selected to develop new cultivars, although the molecular basis of their mutations remains largely unknown (Foster & Aranzana, 2018). Bud sports associated with changes in the DNA sequence have been identified in several fruit trees (Boss & Thomas, 2002; Kobayashi et al., 2004; Yang et al., 2022). However, there are few reports on bud sports caused by epigenetic modification in fruit trees. Here, we report an association of an early-ripening bud mutation with DNA methylation changes in a 17-Mb chromosomal region that displays a large-scale behavior of allelic dropout (ADO) and allele-specific expression (ASE) in peach.

Peach (Prunus persica) is one of the most economically important temperate fruit tree species. It is a typical climacteric fruit with a short shelf life; thus, the improvement in fruit maturity date (MD) is one of the main objectives in breeding programs. Although bud-sport selection has been widely used to improve fruit MD in perennial fruit crops, the molecular basis of MD-related bud sports remains largely unknown. Recently, we identified an early-ripening bud sport designated ‘Li Xia Hong (LXH)’ from a nectarine cultivar ‘Zhong You 4 (ZY4)’, with the bud sport ripening 16 d earlier than its parent (Fig. 1a; Supporting Information Methods S1). To investigate the genetic basis of early ripening in ‘LXH’, we sequenced the genomes of ‘LXH’ and ‘ZY4’ using the Illumina sequencing platform. Overall, 8.71 and 10.52 Gb of high-quality clean reads at 29- and 35-fold depths were generated for ‘ZY4’ and ‘LXH’, respectively (Table S1).
Approximately 95% of the reads for both accessions were mapped to the peach reference genome v.2.0 (Verde et al., 2017). Comparison of DNA sequences between ‘ZY4’ and ‘LXH’ revealed 96 967 single-nucleotide polymorphisms (SNPs), which could be classified into three types, I, II, and III. Type I SNPs were homozygous in ‘LXH’ and heterozygous in ‘ZY4’, while type II SNPs were homozygous in ‘ZY4’ and heterozygous in ‘LXH’. Type III SNPs were in a homozygous state in both ‘LXH’ and ‘ZY4’. Of the 96 967 SNPs, 66 343 (68.4%), 30 110 (31.1%), and 514 (0.5%) belonged to type I (Table S2), type II (Table S3), and type III (Table S4), respectively. Analysis of SNP distribution across the genome revealed that chromosomes (Chrs) 1, 2, 3, 5, 6, 7, and 8 each contained one SNP hotspot of c. 1 Mb in size (Fig. 1b). Interestingly, SNP enrichment was observed across the entire Chr4, which contained 46.2% of the total SNPs. Of the SNPs on Chr4, 82.0% were concentrated in a region from 0 to 17 Mb. For ease of description, the upper half of Chr4 from 0 to 17 Mb and the bottom half from 17 to 26 Mb were designated Chr4-UH and Chr4-BH, respectively. The number of SNPs in the Chr4-UH region was significantly higher than in the Chr4-BH region (Fig. 1c). Notably, the predominant type of SNPs in the Chr4-UH region was type I, with a high frequency of 95.4%, whereas the frequencies of type II and type III SNPs were 4.3% and 0.3%, respectively. By contrast, type I and type II SNPs were both predominant in the Chr4-BH region, with frequencies of 48.9% and 51.0%, respectively. Similarly, SNP hotspots on other chromosomes consisted mainly of type I and type II SNPs, with nearly equal frequencies. The distribution of type II SNPs across the entire Chr4 suggested that the high enrichment of SNPs in the Chr4-UH region was unlikely to have been caused by chromosome elimination. We then manually checked SNPs in the Chr4-UH region using the Integrative Genomics Viewer (IGV).
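The three SNP types defined above can be expressed as a small classification rule. The following is a minimal sketch, not the authors' pipeline: the function names, the genotype encoding (a pair of allele letters per accession), and the toy loci are assumptions made for illustration. Only the three type counts are taken from the text, and the last lines recompute the reported percentages from them.

```python
from collections import Counter


def is_het(genotype):
    """True if a diploid genotype such as ('A', 'G') carries two different alleles."""
    return genotype[0] != genotype[1]


def classify_snp(zy4, lxh):
    """Return 'I', 'II', 'III', or None for one locus.

    Type I:   heterozygous in ZY4, homozygous in LXH.
    Type II:  homozygous in ZY4, heterozygous in LXH.
    Type III: homozygous in both, but for different alleles.
    None:     identical genotypes, or both heterozygous (outside the three types).
    """
    if sorted(zy4) == sorted(lxh):
        return None  # same genotype in both accessions: no polymorphism
    if is_het(zy4) and not is_het(lxh):
        return "I"
    if not is_het(zy4) and is_het(lxh):
        return "II"
    if not is_het(zy4) and not is_het(lxh):
        return "III"
    return None


# Hypothetical toy loci as (ZY4 genotype, LXH genotype); not real data.
toy_loci = [
    (("A", "G"), ("A", "A")),  # type I
    (("C", "C"), ("C", "T")),  # type II
    (("G", "G"), ("T", "T")),  # type III
    (("A", "G"), ("G", "A")),  # identical genotype, unclassified
]
toy_counts = Counter(classify_snp(z, l) for z, l in toy_loci)

# Counts reported in the text; percentages are recomputed from them.
reported = {"I": 66343, "II": 30110, "III": 514}
total = sum(reported.values())
shares = {k: round(100 * n / total, 1) for k, n in reported.items()}
```

The recomputed total and shares agree with the 96 967 SNPs and the 68.4%, 31.1%, and 0.5% given in the text.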
As a result, a number of type I SNP loci were found to contain two alleles in ‘LXH’, although one allele occurred at an extremely low frequency (Fig. S1). To validate this finding, we examined four randomly selected type I SNPs that contained two alleles in the IGV assay using a PCR-direct Sanger sequencing method. Consistent with the IGV results, two peaks were observed at each SNP locus in both accessions, with one peak being very weak in intensity in ‘LXH’ (Fig. S2). These results indicate that SNP enrichment in the Chr4-UH region was likely caused by the amplification failure of one of the two alleles of the heterozygous DNA template in ‘LXH’, a phenomenon termed allelic dropout (Stevens et al., 2017). Similarly, IGV analysis revealed that SNP hotspots in other chromosomal regions were associated with ADO (Fig. S3). As mentioned above, type I and type II SNPs had similar frequencies in the SNP-enriched regions on the bottom half of Chr4 and on other chromosomes, which suggested that, outside the Chr4-UH region, ADO occurred randomly in the peach genome, with similar frequency in ‘LXH’ and ‘ZY4’.

DNA methylation is one of the main causes of allelic dropout (Reinius & Sandberg, 2015). Since DNA methylation has the potential to alter gene expression, we compared the fruit transcriptome between the bud sport and its parent. Transcriptome sequencing was conducted for fruits of ‘LXH’ and ‘ZY4’ at the five and seven stages shown in Fig. 1a, respectively, resulting in a total of 197 Gb of clean reads (Table S5). To facilitate genome-wide identification of expressed SNPs (eSNPs), clean reads of fruit samples from the same accession were merged and mapped to the peach reference genome v.2.0 (Verde et al., 2017). Comparison of transcriptome sequences between ‘ZY4’ and ‘LXH’ revealed 4127 eSNPs.
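The manual IGV check described above, in which a locus called homozygous still shows a second allele at very low read frequency, can be approximated by a simple read-count rule. This is an illustrative sketch only: the function names, the read counts, and the 10% cut-off are assumptions, not values from the study.

```python
def minor_allele_fraction(ref_reads, alt_reads):
    """Fraction of reads supporting the less abundant allele, or None without coverage."""
    total = ref_reads + alt_reads
    if total == 0:
        return None
    return min(ref_reads, alt_reads) / total


def looks_like_dropout(ref_reads, alt_reads, max_fraction=0.10):
    """Flag a locus whose second allele is present but strongly under-represented.

    max_fraction is a hypothetical threshold chosen for illustration.
    A fraction of exactly 0 means no read supports a second allele, so the
    locus is treated as truly homozygous rather than as a dropout candidate.
    """
    maf = minor_allele_fraction(ref_reads, alt_reads)
    return maf is not None and 0 < maf <= max_fraction


# Hypothetical read counts at roughly 35-fold depth.
candidate = looks_like_dropout(33, 2)   # second allele seen in 2 of 35 reads
balanced = looks_like_dropout(18, 17)   # ordinary heterozygote
homozygous = looks_like_dropout(35, 0)  # no second allele at all
```

A rule of this kind only nominates loci for follow-up; in the study, confirmation came from PCR-direct Sanger sequencing.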
Surprisingly, 3518 (85.2%) of the total eSNPs were concentrated in the Chr4-UH region, while only 28 (0.7%) and 581 (14.1%) eSNPs were located in the Chr4-BH region and on other chromosomes, respectively (Fig. 1d). The eSNP density in the Chr4-UH region was significantly higher than in other chromosomal regions (Fig. 1e). According to the definition of SNP types given above, all eSNPs were divided into three types. Of the 4127 eSNPs, 3798 (92.0%, Table S6), 306 (7.4%, Table S7), and 23 (0.6%, Table S8) belonged to type I, type II, and type III, respectively. Among the 3798 type I eSNPs, 3443 (90.7%), 17 (0.4%), and 338 (8.9%) were located in the Chr4-UH region, the Chr4-BH region, and other chromosomes, respectively. The enrichment of type I eSNPs in the Chr4-UH region indicated a large-scale behavior of loss of heterozygosity at the expression level in the bud sport, a phenomenon termed ASE or allelic imbalance (Lu et al., 2021). By contrast, ASE occurred only occasionally in other chromosomal regions in the bud sport. Likewise, only 306 type II eSNPs were detected across the genome, suggesting an occasional occurrence of ASE in ‘ZY4’. Additionally, the Chr4-UH region contained 2524 genes, 1842 of which were expressed in fruits of ‘ZY4’ and ‘LXH’ (Table S9). Of these expressed genes, 809 (43.9%, Table S10) harbored type I eSNPs and were thus deemed to show ASE behavior.

Since DNA methylation is one of the main reasons for ASE (Satyaki & Gehring, 2017; da Rocha & Gendrel, 2019), DNA methylation patterns in fruits of ‘ZY4’ and ‘LXH’ were investigated using whole-genome bisulfite sequencing (WGBS). To determine the developmental stage suitable for DNA methylation analysis, we analyzed fruit transcriptomes of ‘ZY4’ and ‘LXH’ using principal component analysis (PCA). Transcriptomes of fruits of both accessions at 15 or 31 d after full bloom (DAFB) were clustered together, but separated from each other at 47 DAFB (Fig. 2a).
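The contrast in eSNP density between the two halves of Chr4 can be made concrete with the counts and region boundaries given in the text (0 to 17 Mb for Chr4-UH, 17 to 26 Mb for Chr4-BH). The sketch below computes a crude region-wide average per Mb; it is an assumption that this is a useful summary, and it is not the windowed analysis behind Fig. 1e. The helper name is hypothetical.

```python
# Region sizes follow the boundaries stated in the text.
REGION_SIZE_MB = {"Chr4-UH": 17.0, "Chr4-BH": 9.0}
# eSNP counts reported in the text.
ESNP_COUNTS = {"Chr4-UH": 3518, "Chr4-BH": 28}


def density_per_mb(count, size_mb):
    """Average number of sites per megabase over a region."""
    if size_mb <= 0:
        raise ValueError("region size must be positive")
    return count / size_mb


densities = {
    region: density_per_mb(ESNP_COUNTS[region], REGION_SIZE_MB[region])
    for region in REGION_SIZE_MB
}
# Ratio of the two average densities (upper half over bottom half).
fold_difference = densities["Chr4-UH"] / densities["Chr4-BH"]
```

The other chromosomes are left out because their lengths are not stated in the text.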
Moreover, the numbers of differentially expressed genes (DEGs) between these two accessions were 470 and 1388 at 15 and 31 DAFB, respectively, but over 3000 at 47, 55 and 63 DAFB (Fig. 2b). These results suggested that the time point of 31 DAFB represented the initiation phase of divergence in fruit development between ‘ZY4’ and ‘LXH’. Thus, fruit samples of ‘ZY4’ and ‘LXH’ at 31 DAFB were selected for WGBS analysis. Comparison of DNA methylation status between ‘ZY4’ and ‘LXH’ revealed 1978 differentially methylated regions (DMRs), including 1618 (81.8%) CpG-based DMRs, 351 (17.7%) CHG-based DMRs, and 9 (0.5%) CHH-based DMRs (Table S11). Of the 1618 CpG-based DMRs, 1483 (91.7%) were concentrated in the Chr4-UH region (Fig. 1f). Likewise, 237 (67.5%) out of the 351 CHG-based DMRs were enriched in the Chr4-UH region. The densities of CpG-based and CHG-based DMRs in the Chr4-UH region were both significantly higher than in other chromosomal regions (Fig. 1g). To verify the results of DMRs, McrBC-qPCR was conducted for eight randomly selected DMRs that showed higher DNA methylation level in ‘ZY4’ than in ‘LXH’. As expected, the results were consistent with the WGBS analysis (Fig. S4). Notably, hyper- and hypo-DMRs were present in almost equal abundance and alternately distributed in the Chr4-UH region (Fig. S5). Linear regression analysis indicated that the frequency of ADO-related type I SNPs in ‘LXH’ at the genome-wide level was significantly correlated with the densities of CpG-based (r = 0.73, P < 0.001) and CHG-based DMRs (r = 0.29, P < 0.001). Likewise, the frequency of ASE-related type I eSNPs in ‘LXH’ was significantly correlated with the densities of CpG-based (r = 0.71, P < 0.001) and CHG-based DMRs (r = 0.41, P < 0.001). Additionally, genome-wide structural variations (SVs) were similar between ‘ZY4’ and ‘LXH’ (Fig. S6), suggesting SVs had no contribution to the ASE behavior. 
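The correlations reported above relate the frequency of type I SNPs (or eSNPs) to DMR density across the genome. As a hedged illustration of the statistic, the sketch below computes a Pearson correlation coefficient over per-window values. The window values are invented for the example, the function name is hypothetical, and the authors describe their analysis as linear regression, so this is a stand-in for the reported r, not a reproduction of it.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Made-up per-window values: three windows inside an enriched region,
# three windows elsewhere. Not data from the study.
type1_snp_counts = [120, 95, 130, 4, 6, 3]
cpg_dmr_counts = [9, 7, 10, 1, 0, 1]
r = pearson_r(type1_snp_counts, cpg_dmr_counts)
```

With real data, the same function would be applied to SNP and DMR counts tallied over identical genomic windows; the window size is a further choice that the text does not specify.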
Therefore, these results suggested that DNA methylation changes in the 17-Mb fragment are likely responsible for the large-scale behavior of ADO and ASE in the bud sport. To our knowledge, we report for the first time the large-scale behavior of ADO and ASE caused by changes in DNA methylation pattern in organisms. The large-scale behavior of ADO is in contrast to previous findings that ADO is restricted to specific loci such as imprinted loci in mammals (Burger et al., 2007; Stevens et al., 2017). Likewise, the large-scale behavior of ASE is different from previous reports that ASE genes are randomly distributed across the genome in mammals (Gaulton et al., 2010; Crowley et al., 2015) and plants (von Korff et al., 2009; Dong et al., 2017; Albert et al., 2018; Shao et al., 2019). Moreover, 6 and 25 genes encoding DNA methyltransferase and demethylation-related factors, respectively, were identified in the peach genome (Table S12), but they all had similar expression in fruits at 31 DAFB between ‘ZY4’ and ‘LXH’. Thus, the large-scale modification of DNA methylation in the 17-Mb fragment could be controlled by a yet-unknown regulator that has undergone mutation in the bud sport, consistent with the report of ASE induction by nonimprinting factors in mammals (Chess, 2016). Notably, the 17-Mb fragment contains the large-effect quantitative trait locus (QTL) for MD in which two candidate genes PpNAC1 (Prupe.4G187100) and PpNAC5 (Prupe.4G186800) have been reported (Pirona et al., 2013; Lü et al., 2018). PpNAC1 and PpNAC5 were both among the DEGs at 31 DAFB, with up- and downregulation in ‘LXH’, respectively (Table S13). PpNAC1 was highly expressed in fruits of ‘LXH’ at exponential growth and ripening stages and early-ripening cultivars at 31 DAFB, but its expression was very low in later ripening cultivars (Fig. 2c,d), indicating a regulatory role of PpNAC1 in MD. 
Moreover, no DNA sequence variation at PpNAC1 and PpNAC5 was detected between ‘ZY4’ and ‘LXH’ based on Illumina sequencing data. However, the DNA methylation level in the coding and/or rear regions of PpNAC1 and PpNAC5 differed between ‘ZY4’ and ‘LXH’ (Figs 2e, S7). These results suggested that changes in DNA methylation status at these loci are likely associated with the early-ripening bud mutation.

A bud sport is nearly identical to its parent in genetic background, and thus, it is a valuable material for investigating the molecular basis of horticultural traits (Foster & Aranzana, 2018). With the development of high-throughput sequencing technologies, many attempts have been made to identify the causal gene and mutation in bud sports through comparative genomic analysis. However, comparison of the genomes and transcriptomes of the bud sport and parent usually results in thousands of SNPs and DEGs. Here, our results suggest that ADO and ASE contribute to the detection of numerous SNPs and DEGs, respectively, between the bud sport and parent. In addition, ADO is a common phenomenon that causes erroneous assignment of heterozygous genotypes as homozygotes and thus represents an important source of genotyping error and disease misdiagnosis (Wang et al., 2012; Gautier et al., 2013). Our results point out the substantial negative impact of ADO caused by epigenetic modifications on whole-genome sequencing-based genetic studies in woody perennial crops with a high degree of heterozygosity.

This project was financially supported by the Special Fund for Strategic Pilot Technology of the Chinese Academy of Sciences (XDA24030404-4), the Natural Science Foundation of Anhui Province (2108085MC106), the Agriculture Research System of Anhui Province (AHNYCYTX-10), the China Agriculture Research System (CARS-30), and Hubei Hongshan Laboratory (2021hszd017).

Competing interests: None declared.

HZ and YH planned and designed the experiments. HZ, YS, KQ, PS and QX performed the experiments.
HP and JZ contributed experimental materials. LL, HZ and WZ performed data analysis. YH and HZ wrote the manuscript.

The sequencing data generated in this study have been deposited in the NCBI SRA database under the BioProject accession number PRJNA939397.

Fig. S1 Example of the Integrative Genomics Viewer image of 12 single-nucleotide polymorphisms between ‘ZY4’ and ‘LXH’ that were identified in the Chr4-UH region based on Illumina sequencing.
Fig. S2 Validation of four randomly selected single-nucleotide polymorphisms showing the behavior of allelic dropout in the Chr4-UH region using a PCR-direct Sanger sequencing method.
Fig. S3 Example of the Integrative Genomics Viewer image of single-nucleotide polymorphisms between ‘ZY4’ and ‘LXH’ that were identified on chromosomes other than Chr4 based on Illumina sequencing.
Fig. S4 Validation of 12 differentially methylated regions randomly selected from the Chr4-UH region using McrBC-qPCR.
Fig. S5 Distribution of hyper- and hypo-differentially methylated regions in the peach genome.
Fig. S6 Comparison of structural variations at the genome-wide level between ‘ZY4’ and ‘LXH’. Whole-genome sequencing data were generated using Oxford Nanopore Technologies.
Fig. S7 Comparison of DNA methylation at the PpNAC5 locus between ‘ZY4’ and ‘LXH’.
Methods S1 Detailed description of Materials and Methods.
Table S1 Overview of whole-genome resequencing data.
Table S2 Type I single-nucleotide polymorphisms between ‘LXH’ and ‘ZY4’.
Table S3 Type II single-nucleotide polymorphisms between ‘LXH’ and ‘ZY4’.
Table S4 Type III single-nucleotide polymorphisms between ‘LXH’ and ‘ZY4’.
Table S5 Overview of the RNA-Seq data.
Table S6 Type I eSNPs between ‘LXH’ and ‘ZY4’.
Table S7 Type II eSNPs between ‘LXH’ and ‘ZY4’.
Table S8 Type III eSNPs between ‘LXH’ and ‘ZY4’.
Table S9 Genes in the Chr4-UH region and their expression in fruit samples of ‘ZY4’ and ‘LXH’ at different developmental stages.
Table S10 List of genes showing the behavior of ASE in the Chr4-UH region.
Table S11 Number of differentially methylated regions between ‘ZY4’ and ‘LXH’.
Table S12 Genes encoding DNA methyltransferase and demethylation-related factors and their expression in fruits of ‘ZY4’ and ‘LXH’ at 31 d after full bloom.
Table S13 Functional annotation of differentially expressed genes between fruits of ‘ZY4’ and ‘LXH’ at 31 d after full bloom.
Table S14 Primers used in this study.

Please note: Wiley is not responsible for the content or functionality of any Supporting Information supplied by the authors. Any queries (other than missing material) should be directed to the New Phytologist Central Office.
Keywords
DNA methylation changes, allelic dropout