Characterization of transcriptome and development of novel EST-SSR markers based on next-generation sequencing technology in Neolitsea sericea (Lauraceae) endemic to East Asian land-bridge islands

Molecular Breeding - Volume 35 - Pages 1-15 - 2015
Lu-Yao Chen1, Ya-Nan Cao1, Na Yuan1, Koh Nakamura2, Guo-Ming Wang3, Ying-Xiong Qiu1
1Key Laboratory of Conservation Biology for Endangered Wildlife of the Ministry of Education, and College of Life Sciences, Zhejiang University, Hangzhou, China
2Herbarium (HAST), Biodiversity Research Center, Academia Sinica, Nangang, Taiwan
3Zhoushan Forestry Academy, Zhoushan, China

Abstract

Neolitsea sericea (Lauraceae), endemic to East Asian land-bridge islands, is an economically important tree species because of its characteristic aroma and its timber uses. However, owing to the lack of efficient molecular markers, the genetic diversity and historical demography of this species are not clearly understood. In this study, we performed high-throughput transcriptome sequencing of N. sericea leaves on the Illumina HiSeq™ 2000 platform, generating a large set of transcript sequences for functional characterization and for the development of gene-associated SSR markers. A total of 68,624 unigenes (mean length 733 bp) were assembled from about 54.7 million reads, and 41,130 (59.94%) of the assembled unigenes showed similarity to sequences in public databases. From the 68,624 unigenes, 13,213 expressed sequence tag-simple sequence repeats (EST-SSRs) were identified. Di-nucleotide repeats were the most abundant motif type (36.5%), followed by mono- (32.3%) and tri-nucleotide (26.5%) repeats. From the 13,213 EST-SSRs, 1,191 primer pairs were designed for marker mining. Of 131 primer pairs selected at random for validation, 13 amplified polymorphic SSR loci. These 13 EST-SSR markers showed intermediate levels of genetic diversity (e.g., N_A = 7.15; mean H_E = 0.51) when surveyed across six populations: East China (2), Taiwan (1), Korea (1), and, in Japan, the Ryukyus (1) and Honshu (1). Both genetic distance and structure analyses identified two genetic clusters, largely congruent with recent findings based on variation in chloroplast DNA sequences and genomic SSRs. The EST-SSR markers developed here provide a resource for future studies on ecological, evolutionary, and conservation genomics in N. sericea.
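
The SSR identification step summarized in the abstract can be illustrated with a minimal Python sketch that scans a sequence for perfect mono-, di-, and tri-nucleotide repeats. This is not the pipeline used in the study: the function names `find_ssrs` and `motif_class_counts`, and the minimum repeat counts (10, 6, and 5), are assumptions chosen for illustration only, and the abstract does not state the thresholds or software actually applied.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length (mono, di, tri).
# These values are illustrative assumptions, not taken from the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a list of (start, motif, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by at least (min_rep - 1)
        # further copies of the same motif (backreference \1).
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip homopolymer motifs such as "AA" or "AAA"; those runs
            # belong to the mono-nucleotide class.
            if motif_len > 1 and len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


def motif_class_counts(hits):
    """Count SSRs by motif length (1 = mono, 2 = di, 3 = tri)."""
    return dict(Counter(len(motif) for _, motif, _ in hits))


if __name__ == "__main__":
    # Toy sequence, not real N. sericea data.
    toy = "GGC" + "A" * 12 + "CCT" + "AG" * 7 + "TTC" + "CTT" * 5 + "GA"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
    print(motif_class_counts(find_ssrs(toy)))
```

Applied to every unigene, per-class counts of this kind are what motif proportions such as the 36.5% di-nucleotide share are computed from. The sketch detects perfect repeats only; compound or interrupted SSRs would need additional logic.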
