Multi-omics prediction of oat agronomic and seed nutritional traits across environments and in distantly related populations
Abstract
Integrating multi-omics data improved prediction accuracy for oat agronomic and seed nutritional traits in multi-environment trials and in distantly related populations, as well as in single-environment prediction.

Multi-omics prediction has been shown to outperform genomic prediction based on genome-wide DNA markers (G) for predicting phenotypes. However, most existing studies relied on historical datasets from a single environment, so they could not evaluate the efficiency of multi-omics prediction in multi-environment trials or in distantly related populations. To fill those gaps, we designed a systematic experiment to collect omics data and evaluate 17 traits in two oat breeding populations planted in single and multiple environments. In the single-environment trial, the transcriptomic BLUP (T), metabolomic BLUP (M), G + T, G + M, and G + T + M models showed greater prediction accuracy than GBLUP for 5, 10, 11, 17, and 17 traits, respectively, and metabolites generally performed better than transcripts when combined with SNPs. In the multi-environment trial, multi-trait models with omics data outperformed both their counterpart multi-trait GBLUP models and single-environment omics models, and the highest prediction accuracy was achieved when the genetic covariance was modeled with an unstructured covariance matrix. We also demonstrated that omics data collected in one population can be used to prioritize loci and thereby improve genomic prediction in a distantly related population. This was done with a two-kernel linear model that accommodates both likely causal loci with large effects and loci that explain little or no phenotypic variance. We propose that the two-kernel linear model is superior to most genomic prediction models, which assume that each variant is equally likely to affect the trait, and that it can be used to improve prediction accuracy for any trait for which prior knowledge of the genetic architecture is available.
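The two-kernel idea described above can be illustrated with a small numerical sketch. It is not the authors' implementation: the function names `vanraden_kernel` and `two_kernel_blup` are hypothetical, the VanRaden-style scaling of the relationship matrices is an assumed choice, and the variance components are treated as known constants, whereas in practice they would be estimated (for example by REML). The sketch fits y = mu + g1 + g2 + e, where g1 has covariance proportional to a kernel built from prioritized loci and g2 to a kernel built from the remaining loci, and it runs on simulated toy genotypes only.

```python
import numpy as np

def vanraden_kernel(markers):
    """Relationship matrix from a (lines x SNPs) matrix coded 0/1/2 (assumed scaling)."""
    freq = markers.mean(axis=0) / 2.0            # allele frequency of each SNP
    centered = markers - 2.0 * freq              # center each SNP column
    scale = 2.0 * np.sum(freq * (1.0 - freq))    # scaling constant across SNPs
    return centered @ centered.T / scale

def two_kernel_blup(y_train, k1, k2, train, test, var1, var2, var_e):
    """Predict test lines under y = mu + g1 + g2 + e with fixed, assumed variances."""
    cov = var1 * k1 + var2 * k2                  # genetic covariance over all lines
    v = cov[np.ix_(train, train)] + var_e * np.eye(len(train))
    mu = y_train.mean()                          # simple mean, not a GLS estimate
    alpha = np.linalg.solve(v, y_train - mu)     # V^{-1} (y - mu)
    return mu + cov[np.ix_(test, train)] @ alpha

# Toy usage with simulated data (illustrative only, not data from the study).
rng = np.random.default_rng(0)
markers = rng.integers(0, 3, size=(20, 50)).astype(float)
prioritized, background = markers[:, :5], markers[:, 5:]   # hypothetical split of loci
y = prioritized @ rng.normal(size=5) + rng.normal(scale=0.5, size=20)

train, test = np.arange(15), np.arange(15, 20)
k1, k2 = vanraden_kernel(prioritized), vanraden_kernel(background)
preds = two_kernel_blup(y[train], k1, k2, train, test, var1=1.0, var2=0.2, var_e=0.25)
```

Giving the prioritized loci their own kernel lets them carry a different variance from the background loci, which is what distinguishes this setup from a single-kernel GBLUP in which every marker contributes to one shared variance component.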