Evolution and dispersal of mitochondrial DNA haplogroup U5 in Northern Europe: insights from an unsupervised learning approach to phylogeography

Dana Kristjansson1,2, Jon Bohlin1, Trung Ngoc Nguyen3, Astanand Jugessur1,2, Theodore G. Schurr4
1Center for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
2Department of Global Public Health and Primary Care, Faculty of Medicine, University of Bergen, Bergen, Norway
3IT Systems Bergen, Norwegian Institute of Public Health, Bergen, Norway
4Department of Anthropology, University of Pennsylvania, Philadelphia, USA

Abstract

Background: We combined an unsupervised learning methodology for analyzing mitogenome sequences with maximum likelihood (ML) phylogenetics to make detailed inferences about the evolution and diversification of mitochondrial DNA (mtDNA) haplogroup U5, which appears at high frequencies in northern Europe.

Methods: Haplogroup U5 mitogenome sequences were gathered from GenBank. The hierarchical Bayesian Analysis of Population Structure (hierBAPS) method was used to generate groups of sequences, which were then projected onto a rooted ML phylogenetic tree to visualize the pattern of clustering. The haplogroup status of each sequence was assessed using HaploGrep 2.

Results: A total of 23 hierBAPS groups were identified, all of which corresponded to subclades defined in Phylotree v.17. When projected onto the ML phylogeny, the hierBAPS groups accurately clustered all haplotypes belonging to a specific haplogroup, in accordance with HaploGrep 2. By incorporating the geographic source of each sequence and subclade age estimates into this framework, we made inferences about the diversification of U5 mtDNAs. Haplogroup U5 has been present in northern Europe since the Mesolithic and spread both eastward and westward, undergoing significant diversification within Scandinavia. A review of historical and archeological evidence attests to some of the population interactions contributing to this pattern.

Conclusions: The hierBAPS algorithm accurately grouped mitogenome sequences into subclades in a phylogenetically robust manner. This analysis provided new insights into the phylogeographic structure of haplogroup U5 diversity in northern Europe, revealing a detailed perspective on the diversity of subclades in this region and their distribution in Scandinavian populations.
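
The comparison between cluster membership and haplogroup assignment described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `cluster_subclades`, the tuple layout, and the toy sequences are all hypothetical, and the character-wise common prefix is only a rough stand-in for locating the shared subclade in the haplogroup nomenclature. The idea is simply that, if a cluster corresponds to one subclade, the haplogroup labels of its members should share a meaningful prefix.

```python
from collections import defaultdict
from os.path import commonprefix  # character-wise common prefix of a list of strings


def cluster_subclades(assignments):
    """Return {cluster_label: longest haplogroup prefix shared by its members}.

    `assignments` is an iterable of (sequence_id, cluster_label, haplogroup)
    tuples. This layout is an assumption made for illustration only.
    """
    members = defaultdict(list)
    for _seq_id, cluster, haplogroup in assignments:
        members[cluster].append(haplogroup)
    return {cluster: commonprefix(groups) for cluster, groups in members.items()}


# Hypothetical toy data: (sequence id, cluster label, haplogroup label).
toy = [
    ("seq1", 1, "U5a1a1"),
    ("seq2", 1, "U5a1a2"),
    ("seq3", 2, "U5b1b1a"),
    ("seq4", 2, "U5b1b1"),
    ("seq5", 3, "U5b3"),
]

result = cluster_subclades(toy)
print(result)
```

A cluster whose shared prefix collapses to a very short label (for example just "U5") would suggest that it mixes sequences from different subclades and deserves a closer look on the tree.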
