Gossypium barbadense and Gossypium hirsutum genomes provide insights into the origin and evolution of allotetraploid cotton

Nature Genetics - Volume 51, Issue 4 - Pages 739-748 - 2019
Yan Hu1, Jiedan Chen1, Lei Fang1, Zhiyuan Zhang1, Wei Ma1, Yongchao Niu2, Longzhen Ju3, Jieqiong Deng3, Ting Zhao1, Jinmin Lian2, Kobi Baruch4, David D. Fang5, Xia Liu6, Yong‐Ling Ruan1, Mehboob‐ur‐Rahman7, Jinlei Han8, Kai Wang8, Qiong Wang3, Huaitong Wu3, Gaofu Mei3, Yihao Zang3, Zegang Han3, Chenyu Xu3, Weijuan Shen3, Duofeng Yang3, Zhanfeng Si1, Fan Dai1, Liangfeng Zou2, Fei Huang2, Yulin Bai6, Yugao Zhang6, Avital Brodt4, Hilla Ben-Hamo4, Xiefei Zhu3, Baoliang Zhou3, Xueying Guan3, Shuijin Zhu1, Xiao‐Ya Chen9, Tianzhen Zhang3
1Institute of Crop Science, Plant Precision Breeding Academy, Zhejiang Provincial Key Laboratory of Crop Genetic Resources, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
2Genosys Inc., Shenzhen, China
3State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
4NRGene Ltd., Ness Ziona, Israel
5Cotton Fiber Bioscience Research Unit, US Department of Agriculture–Agricultural Research Service–Southern Regional Research Center, New Orleans, LA, USA
6Esquel Group, Wanchai, Hong Kong, China
7Plant Genomics and Molecular Breeding Laboratory, National Institute for Biotechnology and Genetic Engineering (NIBGE), Faisalabad, Pakistan
8Center for Genomics and Biotechnology, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Haixia Institute of Science and Technology, Fujian Agricultural and Forestry University, Fuzhou, China
9National Center for Gene Research, State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China

References

Doebley, J. F., Gaut, B. S. & Smith, B. D. The molecular genetics of crop domestication. Cell 127, 1309–1321 (2006).

Endrizzi, J., Turcotte, E. & Kohel, J. Genetics, cytogenetics and evolution of Gossypium. Adv. Genet. 23, 271–375 (1985).

Wendel, J. F. New World tetraploid cottons contain Old World cytoplasm. Proc. Natl Acad. Sci. USA 86, 4132–4136 (1989).

Zhang, T. et al. Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nat. Biotechnol. 33, 531–537 (2015).

Liu, X. et al. Gossypium barbadense genome sequence provides insight into the evolution of extra-long staple fiber and specialized metabolites. Sci. Rep. 5, 14139 (2015).

Li, F. et al. Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution. Nat. Biotechnol. 33, 524–530 (2015).

Yuan, D. et al. The genome sequence of Sea-Island cotton (Gossypium barbadense) provides insights into the allopolyploidization and development of superior spinnable fibres. Sci. Rep. 5, 17662 (2015).

Du, X. et al. Resequencing of 243 diploid cotton accessions based on an updated A genome identifies the genetic basis of key agronomic traits. Nat. Genet. 50, 796–802 (2018).

Li, F. et al. Genome sequence of the cultivated cotton Gossypium arboreum. Nat. Genet. 46, 567–572 (2014).

Paterson, A. H. et al. Repeated polyploidization of Gossypium genomes and the evolution of spinnable cotton fibres. Nature 492, 423–427 (2012).

Avni, R. et al. Wild emmer genome architecture and diversity elucidate wheat evolution and domestication. Science 357, 93–97 (2017).

International Wheat Genome Sequencing Consortium. Shifting the limits in wheat research and breeding using a fully annotated reference genome. Science 361, eaar7191 (2018).

Luo, M. C. et al. Genome sequence of the progenitor of the wheat D genome Aegilops tauschii. Nature 551, 498–502 (2017).

Zhao, G. Y. et al. The Aegilops tauschii genome reveals multiple impacts of transposons. Nat. Plants 3, 946–955 (2017).

Guo, L. et al. The opium poppy genome and morphinan production. Science 362, 343–347 (2018).

Springer, N. M. et al. The maize W22 genome provides a foundation for functional genomics and transposon biology. Nat. Genet. 50, 1282–1288 (2018).

Luo, S. et al. The cotton centromere contains a Ty3-gypsy-like LTR retroelement. PLoS One 7, e35261 (2012).

Su, H. et al. Dynamic location changes of Bub1-phosphorylated-H2AThr133 with CENH3 nucleosome in maize centromeric regions. New Phytol. 214, 682–694 (2017).

Jiang, J. & Birchler, J. A. Plant Centromere Biology (Wiley-Blackwell, 2013).

Schneider, K. L., Xie, Z., Wolfgruber, T. K. & Presting, G. G. Inbreeding drives maize centromere evolution. Proc. Natl Acad. Sci. USA 113, E987–E996 (2016).

Wang, K., Wu, Y., Zhang, W., Dawe, R. K. & Jiang, J. Maize centromeres expand and adopt a uniform size in the genetic background of oat. Genome Res. 24, 107–116 (2014).

Gong, Z. et al. Repeatless and repeat-based centromeres in potato: implications for centromere evolution. Plant Cell 24, 3559–3574 (2012).

Han, J. et al. Rapid proliferation and nucleolar organizer targeting centromeric retrotransposons in cotton. Plant J. 88, 992–1005 (2016).

Zhu, Z. et al. The NnCenH3 protein and centromeric DNA sequence profiles of Nelumbo nucifera Gaertn (sacred lotus) reveal the DNA structures and dynamics of centromeres in basal eudicots. Plant J. 87, 568–582 (2016).

International Brachypodium Initiative. Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature 463, 763–768 (2010).

Li, Y. et al. Centromeric DNA characterization in the model grass Brachypodium distachyon provides insights on the evolution of the genus. Plant J. 93, 1088–1101 (2018).

Simao, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).

Wendel, J. F. & Cronn, R. C. Polyploidy and the evolutionary history of cotton. Adv. Agron. 78, 139–186 (2003).

Menzel, M. Y. & Brown, M. S. The significance of multivalent formation in three-species Gossypium hybrids. Genetics 39, 546–557 (1954).

Gerstel, D. U. Chromosomal translocations in interspecific hybrids of the genus Gossypium. Evolution 7, 11 (1953).

Brubaker, C. L., Paterson, A. H. & Wendel, J. F. Comparative genetic mapping of allotetraploid cotton and its diploid progenitors. Genome 42, 184–203 (1999).

Argout, X. et al. The genome of Theobroma cacao. Nat. Genet. 43, 101–108 (2011).

Fang, L. et al. Genomic insights into divergence and dual domestication of cultivated allotetraploid cottons. Genome Biol. 18, 33 (2017).

Westengen, O. T., Huaman, Z. & Heun, M. Genetic diversity and geographic pattern in early South American cotton domestication. Theor. Appl. Genet. 110, 392–402 (2005).

Han, L. B. et al. The dual functions of WLIM1a in cell elongation and secondary wall formation in developing cotton fibers. Plant Cell 25, 4421–4438 (2013).

Li, Y. et al. GbEXPATR, a species-specific expansin, enhances cotton fibre elongation through cell wall restructuring. Plant Biotechnol. J. 14, 951–963 (2016).

Andres, Z. et al. Control of vacuolar dynamics and regulation of stomatal aperture by tonoplast potassium uptake. Proc. Natl Acad. Sci. USA 111, E1806–E1814 (2014).

Ruan, Y. L., Llewellyn, D. J. & Furbank, R. T. The control of single-celled cotton fiber elongation by developmentally reversible gating of plasmodesmata and coordinated expression of sucrose and K+ transporters and expansin. Plant Cell 13, 47–60 (2001).

Barragan, V. et al. Ion exchangers NHX1 and NHX2 mediate active potassium uptake into vacuoles to regulate cell turgor and stomatal function in Arabidopsis. Plant Cell 24, 1127–1142 (2012).

Bassil, E. et al. The Arabidopsis Na+/H+ antiporters NHX1 and NHX2 control vacuolar pH and K+ homeostasis to regulate growth, flower development, and reproduction. Plant Cell 23, 3482–3497 (2011).

Hedrich, R., Sauer, N. & Neuhaus, H. E. Sugar transport across the plant vacuolar membrane: nature and regulation of carrier proteins. Curr. Opin. Plant Biol. 25, 63–70 (2015).

Meyer, S., De Angeli, A., Fernie, A. R. & Martinoia, E. Intra- and extra-cellular excretion of carboxylates. Trends Plant Sci. 15, 40–47 (2010).

Meyer, S. et al. Malate transport by the vacuolar AtALMT6 channel in guard cells is subject to multiple regulation. Plant J. 67, 247–257 (2011).

Nei, M. & Kumar, S. Molecular evolution and phylogenetics. Heredity 86, 385–385 (2000).

Wang, L. & Ruan, Y. L. Unraveling mechanisms of cell expansion linking solute transport, metabolism, plasmodesmatal gating and cell wall dynamics. Plant Signal. Behav. 5, 1561–1564 (2010).

Wang, L., Cook, A., Patrick, J. W., Chen, X. Y. & Ruan, Y. L. Silencing the vacuolar invertase gene GhVIN1 blocks cotton fiber initiation from the ovule epidermis, probably by suppressing a cohort of regulatory genes via sugar signaling. Plant J. 78, 686–696 (2014).

Wang, L. et al. Evidence that high activity of vacuolar invertase is required for cotton fiber and Arabidopsis root elongation through osmotic dependent and independent pathways, respectively. Plant Physiol. 154, 744–756 (2010).

Li, X. R., Wang, L. & Ruan, Y. L. Developmental and molecular physiological evidence for the role of phosphoenolpyruvate carboxylase in rapid cotton fibre elongation. J. Exp. Bot. 61, 287–295 (2010).

Zhang, Z. et al. Suppressing a putative sterol carrier gene reduces plasmodesmal permeability and activates sucrose transporter genes during cotton fiber elongation. Plant Cell 29, 2027–2046 (2017).

Naramoto, S. et al. ADP-ribosylation factor machinery mediates endocytosis in plant cells. Proc. Natl Acad. Sci. USA 107, 21890–21895 (2010).

Xu, J. & Scheres, B. Dissection of Arabidopsis ADP-RIBOSYLATION FACTOR 1 function in epidermal cell polarity. Plant Cell 17, 525–536 (2005).

Mittler, R., Finka, A. & Goloubinoff, P. How do plants feel the heat? Trends Biochem. Sci. 37, 118–125 (2012).

Kendrick, M. D. & Chang, C. Ethylene signaling: new levels of complexity and regulation. Curr. Opin. Plant Biol. 11, 479–485 (2008).

Raghavendra, A. S., Gonugunta, V. K., Christmann, A. & Grill, E. ABA perception and signalling. Trends Plant Sci. 15, 395–401 (2010).

Kohel, R., Richmond, T. & Lewis, C. Texas Marker-1. Description of a genetic standard for Gossypium hirsutum L. Crop Sci. 10, 670–671 (1970).

Paterson, A. H., Brubaker, C. L. & Wendel, J. F. A rapid method for extraction of cotton (Gossypium spp.) genomic DNA suitable for RFLP or PCR analysis. Plant Mol. Biol. Rep. 11, 122–127 (1993).

Zhang, M. et al. Preparation of megabase-sized DNA from a variety of organisms using the nuclei method for advanced genomics research. Nat. Protoc. 7, 467–478 (2012).

Wang, S. et al. Sequence-based ultra-dense genetic and physical maps reveal structural variations of allopolyploid cotton genomes. Genome Biol. 16, 108 (2015).

Van Ooijen, J. W. & Voorrips, R. JoinMap: version 3.0 (Plant Research International, 2001).

Wu, Y., Bhat, P. R., Close, T. J. & Lonardi, S. Efficient and accurate construction of genetic linkage maps from the minimum spanning tree of a graph. PLoS Genet. 4, e1000212 (2008).

Xie, T. et al. De novo plant genome assembly based on chromatin interactions: a case study of Arabidopsis thaliana. Mol. Plant 8, 489–492 (2015).

Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

Akdemir, K. C. & Chin, L. HiCPlotter integrates genomic data with interaction matrices. Genome Biol. 16, 198 (2015).

Guo, W. et al. A preliminary analysis of genome structure and composition in Gossypium hirsutum. BMC Genomics 9, 314 (2008).

Nagaki, K. et al. Sequencing of a rice centromere uncovers active genes. Nat. Genet. 36, 138–145 (2004).

Zang, C. et al. A clustering approach for identification of enriched domains from histone modification ChIP-Seq data. Bioinformatics 25, 1952–1958 (2009).

Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013).

Wang, B. et al. Unveiling the complexity of the maize transcriptome by single-molecule long-read sequencing. Nat. Commun. 7, 11708 (2016).

Goodstein, D. M. et al. Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res. 40, D1178–D1186 (2012).

She, R., Chu, J. S., Wang, K., Pei, J. & Chen, N. GenBlastA: enabling BLAST to identify homologous gene sequences. Genome Res. 19, 143–149 (2009).

Birney, E., Clamp, M. & Durbin, R. GeneWise and Genomewise. Genome Res. 14, 988–995 (2004).

Stanke, M., Schoffmann, O., Morgenstern, B. & Waack, S. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics 7, 62 (2006).

Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997).

Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879 (2004).

Guigo, R. Assembling genes from predicted exons in linear time with dynamic programming. J. Comput. Biol. 5, 681–702 (1998).

Korf, I. Gene finding in novel genomes. BMC Bioinformatics 5, 59 (2004).

Kent, W. J. Blat: the BLAST-like alignment tool. Genome Res. 12, 656–664 (2002).

Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).

Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).

Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7 (2008).

Punta, M. et al. The Pfam protein families database. Nucleic Acids Res. 40, D290–D301 (2012).

Hunter, S. et al. InterPro in 2011: new developments in the family and domain prediction database. Nucleic Acids Res. 40, D306–D312 (2012).

Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).

Bairoch, A. & Apweiler, R. The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999. Nucleic Acids Res. 27, 49–54 (1999).

Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240 (2014).

Conesa, A. et al. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21, 3674–3676 (2005).

Kanehisa, M. et al. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res. 42, D199–D205 (2014).

Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964 (1997).

Griffiths-Jones, S. et al. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res. 33, D121–D124 (2005).

Nawrocki, E. P., Kolbe, D. L. & Eddy, S. R. Infernal 1.0: inference of RNA alignments. Bioinformatics 25, 1335–1337 (2009).

Nussbaumer, T. et al. MIPS PlantsDB: a database framework for comparative plant genome research. Nucleic Acids Res. 41, D1144–D1151 (2013).

Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).

Ou, S. & Jiang, N. LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol. 176, 01310 (2017).

Li, L., Stoeckert, C. J. Jr. & Roos, D. S. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 13, 2178–2189 (2003).

Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).

Fryxell, P. A. The Natural History of the Cotton Tribe (Texas A&M University Press, 1979).

Grover, C. E., Kim, H., Wing, R. A., Paterson, A. H. & Wendel, J. F. Incongruent patterns of local and global genome size evolution in cotton. Genome Res. 14, 1474–1482 (2004).

Saitou, N. & Nei, M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987).

Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).

Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).

Talevich, E. et al. CNVkit: genome-wide copy number detection and visualization from targeted DNA sequencing. PLoS Comput. Biol. 12, e1004873 (2016).