The generation of random ultrametric matrices representing dendrograms
Tóm tắt
Many methods and algorithms to generate random trees of many kinds have been proposed in the literature. No procedure exists however for the generation of dendrograms with randomized fusion levels. Randomized dendrograms can be obtained by randomizing the associated cophenetic matrix. Two algorithms are described. The first one generates completely random dendrograms, i.e., trees with a random topology, random fusion level values, and random assignment of the labels. The second algorithm uses a double-permutation procedure to randomize a given dendrogram; it proceeds by randomization of the fixed fusion levels, instead of using random fusion level values. A proof is presented that the double-permutation procedure is a Uniform Random Generation Algorithmsensu Furnas (1984), and a complete example is given.
Tài liệu tham khảo
COLLESS, D. H. (1980), “Congruence Between Morphometric and Allozyme Data forMenidia Species: a Reappraisal,”Systematic Zoology, 29, 288–299.
DE SOETE, G. (1984), “Ultrametric Tree Representations of Incomplete Dissimilarity Data,”Journal of Classification, 1, 235–242.
FAITH, D.P., and BELBIN, L. (1986), “Comparison of Classifications Using Measures Intermediate Between Metric Dissimilarity and Consensus Similarity,”Journal of Classification, 3, 257–280.
FELSENSTEIN, J. (1978), “The Number of Evolutionary Trees,”Systematic Zoology, 27, 27–33.
FELSENSTEIN, J. (1985), “Confidence Limits on Phylogenies: An Approach Using the Bootstrap,”Evolution, 39, 783–791.
FRANK, O., and SVENSSON, K. (1981), “On Probability Distributions of Single-Linkage Dendrograms,”Journal of Statistics and Computer Simulation, 12, 121–131.
FURNAS, G. W. (1984), “The Generation of Random, Binary Unordered Trees,”Journal of Classification, 1, 187–233.
GÖBEL, F. (1980), “On a 1-1 Correspondence Between Rooted Trees and Natural Numbers,”Journal of Combinatorial Theory, Series B, 29, 141–143.
GOWER, J. C., and LEGENDRE, P. (1986), “Metric and Euclidean Properties of Dissimilarity Coefficients,”Journal of Classification, 3, 5–48.
GUENOCHE, A. (1983), “Random Spanning Trees,”Journal of Algorithms, 4, 214–220.
HAJDU L. J. (1981), “Graphical Comparison of Resemblance Measures in Phytosociology,”Vegetatio, 48, 47–59.
HARDING, E. F. (1971), “The Probabilities of Rooted Tree-Shapes Generated by Random Bifurcation,”Advances in Applied Probability, 3, 44–77.
HARTIGAN, J. A. (1967), “Representation of Similarity Matrices by Trees,”Journal of the American Statistical Association, 62, 1140–1158.
KNOTT, G. D. (1977), “A Numbering System for Binary Trees,”Communication of the Association for Computing Machinery,20(2).
LAPOINTE, F.-J., and LEGENDRE, P. (1990), “A Statistical Framework to Test the Consensus of Two Nested Classifications,”Systematic Zoology, 39, 1–14.
LAPOINTE, F.-J., and LEGENDRE, P. (submitted), “A Statistical Framework to Test the Consensus among Additive Trees (Cladograms).”
MICKEVICH, M. F. (1978), “Taxonomic Congruence,”Systematic Zoology, 27, 143–158.
MURTAGH, F. (1983), “A Probability Theory of Hierarchic Clustering Using Random Dendrograms,”Journal of Statistics and Computer Simulation, 18, 145–157.
MURTAGH, F. (1984), “Counting Dendrograms: A Survey,”Discrete Applied Mathematics, 7, 191–199.
NEMEC, A. F. L., and BRINKHURST, R. O. (1988), “Using the Bootstrap to Assess Statistical Significance in the Cluster Analysis of Species Abundance Data,”Canadian Journal of Fisheries and Aquatic Sciences, 45, 965–970.
NIJENHUIS, A., and WILF, H. S. (Eds.) (1978),Combinatorial Algorithms for Computers and Calculators, Second Edition, New York: Academic Press.
ODEN, N. L., and SHAO, K. T. (1984), “An Algorithm to Equiprobably Generate All Directed Trees With k Labeled Terminal Nodes and Unlabeled Interior Nodes,”Bulletin of Mathematical Biology, 46, 379–387.
PAGE, R. D. M. (1988), “Quantitative Cladistic Biogeography: Constructing and Comparing Area Cladograms,”Systematic Zoology, 37, 254–270.
PAGE, R. D. M. (1990), “Temporal Congruence and Cladistic Analysis of Biogeography and Cospeciation,”Systematic Zoology, 39, 205–226.
PHIPPS J. B. (1975), “The Numbers of Classifications,”Canadian Journal of Botany, 54, 686–688.
PROSKUROWSKI, A. (1980), “On the Generation of Binary Trees,”Journal of the Association for Computing Machinery, 27, 1–2.
QUIROZ, A. J. (1989), “Fast Random Generation of Binary, t-ary and Other Types of Trees,”Journal of Classification, 6, 223–231.
ROHLF, F. J. (1983), “Numbering Binary Trees With Labeled Terminal Vertices,”Bulletin of Mathematical Biology, 45, 33–40.
ROSEN, D. E. (1978), “Vicariant Patterns and Historical Explanation in Biogeography,”Systematic Zoology, 27, 159–188.
ROTEM, D., and VAROL, Y. L. (1978), “Generation of Binary Trees from Ballot Sequences,”Journal of the Association for Computing Machinery, 25, 396–404.
SHAO, K., and ROHLF, F. J. (1983), “Sampling Distributions of Consensus Indices when all Bifurcating Trees are Equally Likely” inNumerical Taxonomy, ed., J. Felsenstein, NATO Advanced Studies Institute, Ser. G. (Ecological Sciences) 1, Berlin: Springer Verlag, 132–137.
SHAO, K., and SOKAL, R. R. (1986), “Significance Tests of Consensus Indices,”Systematic Zoology, 35, 582–590.
SIBSON, R. (1972), “Order Invariant Methods for Data Analysis,”Journal of the Royal Statistical Society, B 34, 311–349.
SNEATH, P. H. A., and SOKAL, R. R. (1973),Numerical Taxonomy, San Francisco: Freeman.
SOKAL, R. R., and ROHLF, F. J. (1962), “The Comparison of Dendrograms by Objective Methods,”Taxon, 11, 33–40.
SOLOMON, M., and FINKEL, R. A. (1980), “A Note on Enumerating Binary Trees,”Journal of the Association for Computing Machinery, 27, 3–5.
