Eight high-quality genomes reveal pan-genome architecture and ecotype differentiation of Brassica napus

Nature Plants - Volume 6, Issue 1 - Pages 34-45
Jia‐Ming Song1, Zhilin Guan1, Jianlin Hu1, Chaocheng Guo1, Zhiquan Yang1, Shuo Wang1, Dongxu Liu1, Bo Wang1, Shaoping Lu1, Run Zhou1, Wen‐Zhao Xie1, Yuanfang Cheng1, Shouxin Zhang1, Kede Liu1, Qingyong Yang1, Ling‐Ling Chen1, Liang Guo1
1National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, People’s Republic of China

Abstract

Rapeseed (Brassica napus) is the second most important oilseed crop in the world, but the genetic diversity underlying its extensive phenotypic variation remains largely unexplored. Here, we report the sequencing, de novo assembly and annotation of eight B. napus accessions. Using comparative pan-genome analysis, we identified millions of small variants and 77.2 to 149.6 megabases of presence and absence variations (PAVs). More than 9.4% of the genes contained large-effect mutations or structural variations. A PAV-based genome-wide association study (PAV-GWAS) directly identified causal structural variations for silique length, seed weight and flowering time in a nested association mapping population with ZS11 (the reference line) as the donor. These variations were not detected by a GWAS based on single-nucleotide polymorphisms (SNP-GWAS), showing that PAV-GWAS complements SNP-GWAS in identifying trait associations. Further analysis showed that PAVs in three FLOWERING LOCUS C genes were closely associated with flowering time and ecotype differentiation. This study provides resources to support a better understanding of genome architecture and to accelerate the genetic improvement of B. napus.

