The chromosome-level genome assembly of Gentiana dahurica (Gentianaceae) provides insights into gentiopicroside biosynthesis

DNA Research, Volume 29, Issue 2, 2022
Ting Li¹, Yu Xi¹, Yumeng Ren¹, Minghui Kang¹, Wenjie Yang¹, Landi Feng¹, Quanjun Hu¹
¹ Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, China

Abstract

Gentiana dahurica Fisch. is a perennial herb of the family Gentianaceae. The species is used in traditional Tibetan medicine because it is rich in gentiopicroside. Here, we generate a high-quality, chromosome-level genome of G. dahurica with a total length of 1,416.54 Mb. Comparative genomic analyses showed that G. dahurica shared one whole-genome duplication (WGD) event with Gelsemium sempervirens of the family Gelsemiaceae and underwent one additional species-specific WGD after the ancient whole-genome triplication shared with other eudicots. Further transcriptome analyses identified numerous enzyme-coding genes and transcription factors related to gentiopicroside biosynthesis. A set of cytochrome P450 genes was identified as candidates for the biosynthetic shift from swertiamarin to gentiopicroside. Both gene expression and contents measured by high-performance liquid chromatography indicated that gentiopicroside was mainly synthesized in the rhizomes, which also had the highest contents. In addition, we found that the two above-mentioned WGDs contributed greatly to the candidate genes identified as involved in gentiopicroside biosynthesis. The first reference genome of Gentianaceae, generated here, will accelerate evolutionary, ecological, and pharmaceutical studies of this family.
