Genome sequence, comparative analysis and haplotype structure of the domestic dog

Nature - Volume 438, Issue 7069 - Pages 803-819 - 2005
Kerstin Lindblad‐Toh1, Claire M. Wade1, Tarjei S. Mikkelsen1, Elinor K. Karlsson1, David B. Jaffe1, Michael Kamal1, Michèle Clamp1, Jean L. Chang1, Edward J. Kulbokas1, Michael C. Zody1, Manfred Grabherr1, Hong Xue1, Matthew Breen2, Robert K. Wayne3, Elaine A. Ostrander4, Chris P. Ponting5, Francis Galibert6, Andrew R. Smith7, Pieter DeJong8, Ewen F. Kirkness9, Pablo Álvarez1, Tara Biagi1, William W. Brockman1, Jonathan A. Butler1, Chee-Wye Chin1, April Cook1, James Cuff1, Mark J. Daly10, David DeCaprio1, Sante Gnerre1, M. Kellis11, Michael Kleber1, Carolyne Bardeleben3, Leo Goodstadt5, Andreas Heger5, Christophe Hitte6, Lisa Kim4, Klaus‐Peter Koepfli3, Heidi G. Parker4, John P. Pollinger3, Stephen M. J. Searle12, Nathan B. Sutter4, Rachael Thomas2, Caleb Webber5, Sílvia Beà13
1Broad Institute of Harvard and MIT, 320 Charles Street, Cambridge, Massachusetts 02141, USA
2Department of Molecular Biomedical Sciences, College of Veterinary Medicine, North Carolina State University, 4700 Hillsborough Street, Raleigh, North Carolina 27606, USA
3Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095, USA
4National Human Genome Research Institute, National Institutes of Health, 50 South Drive, MSC 8000, Building 50, Bethesda, Maryland 20892-8000, USA
5Department of Human Anatomy and Genetics, MRC Functional Genetics, University of Oxford, South Parks Road, Oxford OX1 3QX, UK
6UMR 6061 Génétique et Développement, CNRS-Université de Rennes 1, Faculté de Médecine, 2, Avenue Léon Bernard, 35043 Rennes Cedex, France
7Agencourt Bioscience Corporation, 500 Cummings Center, Suite 2450, Beverly, Massachusetts 01915, USA
8Children's Hospital Oakland Research Institute, 5700 Martin Luther King Jr Way, Oakland, California 94609, USA
9The Institute for Genomic Research, Rockville, Maryland 20850, USA
10Center for Human Genetic Research, Massachusetts General Hospital, 185 Cambridge Street, Boston, Massachusetts 02114, USA
11Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts 02139, USA
12The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
13Whitehead Institute for Biomedical Research, 9 Cambridge Center, Cambridge, Massachusetts 02142, USA
