How frugal is mother nature with haplotypes?

Bioinformatics (Oxford, England) - Tập 25 Số 1 - Trang 68-74 - 2009
Sharlee Climer1, Gerold Jäger1, Alan R. Templeton1, Weixiong Zhang1
11 Department of Computer Science and Engineering, 2Department of Cardiology, 3Division of Biostatistics, 4Department of Biology and 5Department of Genetics, Washington University, St. Louis, MO, USA

Tóm tắt

Abstract Motivation: Inference of haplotypes from genotype data is crucial and challenging for many vitally important studies. The first, and most critical step, is the ascertainment of a biologically sound model to be optimized. Many models that have been proposed rely partially or entirely on reducing the number of unique haplotypes in the solution. Results: This article examines the parsimony of haplotypes using known haplotypes as well as genotypes from the HapMap project. Our study reveals that there are relatively few unique haplotypes, but not always the least possible, for the datasets with known solutions. Furthermore, we show that there are frequently very large numbers of parsimonious solutions, and the number increases exponentially with increasing cardinality. Moreover, these solutions are quite varied, most of which are not consistent with the true solutions. These results quantify the limitations of the Pure Parsimony model and demonstrate the imperative need to consider additional properties for haplotype inference models. At a higher level, and with broad applicability, this article illustrates the power of combinatorial methods to tease out imperfections in a given biological model. Contact:  [email protected]

Từ khóa


Tài liệu tham khảo

Andrés, 2007, Understanding the accuracy of statistical haplotype inference with sequence data of known phase, Genet. Epi., 31, 659, 10.1002/gepi.20185

Blain, 2005, Mathematical approaches to the pure parsimony problem, Technical Report S40

Brown, 2004, A new integer programming formulation for the pure parsimony problem in haplotype analysis, Proceedings of 2004 Workshop on Algorithms in Bioinformatics (WABI), 254

Brown, 2006, Integer programming approaches to haplotype inference by pure parsimony, IEEE/ACM Trans. Comput. Biol. Bioinform., 3, 141, 10.1109/TCBB.2006.24

Brown, 2006, Toward an algebraic understanding of haplotype inference pure parsimony, Proceedings of the Computational Systems Bioinformatics Conference, 211, 10.1142/9781860947575_0027

Cilibrasi, 2005, On the complexity of several haplotyping problems, 5th Workshop on Algorithms in Bioinformatics (WABI 2005)., 128, 10.1007/11557067_11

Clark, 1990, Inference of haplotypes from PCR-amplified samples of diploid populations, Mol. Biol. Evol., 77, 111

Di Gaspero, 2008, Stochastic local search for large-scale instances of the haplotype inference problem by pure parsimony, J. Algorithm. Logic Inform. Cogn., 63, 55

Eskin, 2003, Large scale reconstruction of haplotypes from genotype data, Proceedings of the Seventh Annual International Conference on Research in Computational Molecular Biology., 104, 10.1145/640075.640088

Garey, 1979, Computers and Intractability: A Guide to the Theory of NP-Completeness.

Graca, 2007, Efficient haplotype inference with pseudo-boolean optimization, Proceedings of Algebraic Biology, 125, 10.1007/978-3-540-73433-8_10

Gusfield, 2000, A practical algorithm for optimal inference of haplotypes from diploid populations, Proceedings of the Eighth International Conference on Intelligent System for Molecular Biology, 183

Gusfield, 2002, Haplotyping as perfect phylogeny: conceptual framework and efficient solutions, Research in Computational Molecular Biology (RECOMB '02), 166

Gusfield, 2003, Haplotype inference by pure parsimony, 14th Annual Symposium on Combinatorial Pattern Matching (CPM'03)., 144, 10.1007/3-540-44888-8_11

Gusfield, 2005, Haplotype inference, Handbook on Bioinformatics, 1

Halldorsson, 2003, Combinatorial problems arising in SNP and haplotype analysis, Discrete Mathematics and Theoretical Computer Science, Proceedings of DMTCS 2003., 26

Huang, 2005, An approximation algorithm for haplotype inference by maximum parsimony, J. Comput. Biol., 12, 1261, 10.1089/cmb.2005.12.1261

Lancia, 2006, A polynomial case of the parsimony haplotyping problem, Operations Res. Lett., 34, 289, 10.1016/j.orl.2005.05.007

Lancia, 2004, Haplotyping populations by pure parsimony: Complexity of exact and approximation algorithms, INFORMS J. Comput., 16, 348, 10.1287/ijoc.1040.0085

Li, 2005, A parsimonious tree-grow method for haplotype inference, Bioinformatics, 21, 3475, 10.1093/bioinformatics/bti572

Lynce, 2006, Efficient haplotype inference with Boolean satisfiability, National Conference in Artificial Intelligence (AAAI), 104

Marques-Silva, 2006, Efficient and tight upper bounds for haplotype inference, Technical Report 16

Marques-Silva, 2006, Improved lower bounds for SAT-based haplotype inference, Technical Report 17

Martello, 1987, Linear assignment problems, Ann. Discrete Math., 31, 259

Myers, 2005, A fine-scale map of recombination rates and hotspots across the human genome, Science, 310, 321, 10.1126/science.1117196

Niu, 2002, Bayesian haplotype inference for multiple linked single nucleotide polymorphisms, Am. J. Hum. Genet., 70, 157, 10.1086/338446

Orzack, 2003, Analysis and exploration of the use of rule-based algorithms and consensus methods for the inferral of haplotypes, Genetics, 165, 915, 10.1093/genetics/165.2.915

Schneider, 1996, Searching for backbones – an efficient parallel algorithm for the traveling salesman problem, Comput. Phys. Commun., 96, 173, 10.1016/0010-4655(96)00062-8

Sharan, 2006, Islands of tractability for parsimony haplotyping, IEEE/ACM Trans. Comput. Biol. Bioinform., 3, 303, 10.1109/TCBB.2006.40

Stephens, 2003, A comparison of Bayesian methods for haplotype reconstruction from population genotype data, Am. J. Hum. Genet., 73, 1162, 10.1086/379378

Templeton, 2005, Tree scanning: a method for using haplotype trees in genotype/phenotype association studies, Genetics, 169, 441, 10.1534/genetics.104.030080

The International HapMap Consortium, 2003, The International HapMap Project, Nature, 426, 789, 10.1038/nature02168

The International HapMap Consortium, 2005, A haplotype map of the human genome, Nature, 437, 1299, 10.1038/nature04226

Wang, 2006, Computational experiments on algorithms for haplotype inference problems by pure parsimony, Proceedings of the 9th Joint Conference on Information Sciences (JCIS), 824, 10.2991/jcis.2006.243

Wang, 2003, Haplotype inference by maximum parsimony, Bioinformatics, 19, 1773, 10.1093/bioinformatics/btg239

Wang, 2005, Haplotype inference by pure parsimony via genetic algorithm, Operations Research and Its Applications, The Fifth International Symposium, (ISORA'05), 308