Optimization of Selective Phenotyping and Population Design for Genomic Prediction

Nicolas Heslot1, Vitaliy Feoktistov1
1Biostatistics Department, Limagrain Field Seeds Research, Chappes Research Center, Chappes, France

Abstract

Genomic prediction, the joint analysis of high-density molecular marker data and phenotypes to predict the performance of individuals for breeding purposes, is now used routinely in many plant and animal breeding programs. It raises several new design questions. First, how should a subset of preexisting individuals be selected for phenotyping, based on the molecular marker data, so that marker effects are estimated with the highest precision? Second, in hybrid species, which hybrid combinations should be created and phenotyped to best predict the performance of the unobserved hybrid combinations? Third, given a list of individuals, which new populations should be created to optimize the estimation of marker effects under a budget constraint? These three design questions are interrelated and critical to improving the efficiency of breeding. In this article we present efficient optimization methods to answer them, together with validation results based on real data and simulations. The results show that, in several situations, significant gains are possible in the precision of the evaluation of selection candidates and of marker effect estimates, which can help further increase the efficiency of plant breeding.
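
To make the first design question concrete, here is a minimal sketch of selecting a phenotyping subset from marker data alone. It is not the method of this article, which the abstract does not specify. It assumes a ridge-regression model of marker effects, an A-optimality criterion (the trace of the inverse of Z_S'Z_S + lam*I, proportional to the summed variance of the marker effect estimates), and a simple greedy search. The function name `greedy_a_optimal_subset`, the shrinkage parameter `lam`, and the simulated marker matrix are all hypothetical.

```python
import numpy as np


def greedy_a_optimal_subset(Z, n_select, lam=1.0):
    """Greedily choose rows of Z (individuals) to phenotype.

    Assumed criterion: trace((Z_S' Z_S + lam * I)^-1), which under a
    ridge-regression model is proportional to the summed variance of
    the marker effect estimates. Smaller is better.
    """
    n, p = Z.shape
    M_inv = np.eye(p) / lam          # inverse information matrix for the empty set
    remaining = list(range(n))
    selected = []
    for _ in range(n_select):
        cand = Z[remaining]          # candidate marker profiles, shape (r, p)
        MZ = cand @ M_inv            # row i holds (M_inv @ z_i), M_inv being symmetric
        # Sherman-Morrison: adding z lowers the trace by |M_inv z|^2 / (1 + z' M_inv z)
        num = np.einsum("ij,ij->i", MZ, MZ)
        den = 1.0 + np.einsum("ij,ij->i", MZ, cand)
        best = int(np.argmax(num / den))
        u = MZ[best]
        M_inv = M_inv - np.outer(u, u) / den[best]   # rank-one update of the inverse
        selected.append(remaining.pop(best))
    return selected, float(np.trace(M_inv))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated markers coded -1/0/1 for 50 individuals and 20 markers (illustrative only)
    Z = rng.integers(-1, 2, size=(50, 20)).astype(float)
    chosen, criterion = greedy_a_optimal_subset(Z, n_select=10, lam=1.0)
    print("selected individuals:", chosen)
    print("trace criterion:", criterion)
```

A greedy search is only a heuristic: it is cheap and easy to check, but it gives no guarantee of finding the best subset, and a global optimizer could be substituted without changing the criterion.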
