Symmetry in data mining and analysis: A unifying view based on hierarchy

Proceedings of the Steklov Institute of Mathematics - Volume 265, Issue 1 - Pages 177-198 - 2009
Fionn Murtagh¹
¹Science Foundation Ireland, Wilton Park House, Wilton Place, Dublin 2, Ireland

References

C. Bandt, “Ordinal Time Series Analysis,” Ecol. Modell. 182, 229–238 (2005).

C. Bandt and B. Pompe, “Permutation Entropy: A Natural Complexity Measure for Time Series,” Phys. Rev. Lett. 88(17), 174102 (2002).

C. Bandt and F. Shiha, “Order Patterns in Time Series,” J. Time Series Anal. 28(5), 646–665 (2007); http://www.math-inf.uni-greifswald.de/~bandt/pub/orderpattern3.pdf

R. G. Baraniuk, V. Cevher, M. F. Duarte, and C. Hegde, “Model-Based Compressive Sensing,” arXiv: 0808.3572.

J. J. Benedetto and R. L. Benedetto, “A Wavelet Theory for Local Fields and Related Groups,” J. Geom. Anal. 14, 423–456 (2004).

R. L. Benedetto, “Examples of Wavelets for Local Fields,” in Wavelets, Frames and Operator Theory, Ed. by C. Heil, P. E. T. Jorgensen, and D. R. Larson (Am. Math. Soc., Providence, RI, 2004), Contemp. Math. 345, pp. 27–47.

J.-P. Benzécri, L’analyse des données, Vol. 1: La taxinomie, 2nd ed. (Dunod, Paris, 1979).

P. E. Bradley, “Mumford Dendrograms,” Comput. J., doi:10.1093/comjnl/bxm088 (2008).

L. Brekke and P. G. O. Freund, “p-Adic Numbers in Physics,” Phys. Rep. 233, 1–66 (1993).

P. Chakraborty, “Looking through Newly to the Amazing Irrationals,” arXiv:math/0502049v1.

M. Costa, A. L. Goldberger, and C.-K. Peng, “Multiscale Entropy Analysis of Biological Signals,” Phys. Rev. E 71(2), 021906 (2005).

F. Critchley and W. Heiser, “Hierarchical Trees Can Be Perfectly Scaled in One Dimension,” J. Classif. 5, 5–20 (1988).

B. A. Davey and H. A. Priestley, Introduction to Lattices and Order, 2nd ed. (Cambridge Univ. Press, Cambridge, 2002).

F. Delon, “Espaces ultramétriques,” J. Symb. Log. 49, 405–424 (1984).

S. B. Deutsch and J. J. Martin, “An Ordering Algorithm for Analysis of Data Arrays,” Oper. Res. 19, 1350–1362 (1971).

D. L. Donoho and J. Tanner, “Neighborliness of Randomly-Projected Simplices in High Dimensions,” Proc. Natl. Acad. Sci. USA 102, 9452–9457 (2005).

B. Dragovich and A. Dragovich, “p-Adic Modelling of the Genome and the Genetic Code,” Comput. J., doi:10.1093/comjnl/bxm083 (2007).

R. A. Fisher, “The Use of Multiple Measurements in Taxonomic Problems,” Ann. Eugen. 7, 179–188 (1936).

R. Foote, “An Algebraic Approach to Multiresolution Analysis,” Trans. Am. Math. Soc. 357, 5031–5050 (2005).

R. Foote, “Mathematics and Complex Systems,” Science 318, 410–412 (2007).

R. Foote, G. Mirchandani, D. N. Rockmore, D. Healy, and T. Olson, “A Wreath Product Group Approach to Signal and Image Processing. I: Multiresolution Analysis,” IEEE Trans. Signal Process. 48, 102–132 (2000).

R. Foote, G. Mirchandani, D. N. Rockmore, D. Healy, and T. Olson, “A Wreath Product Group Approach to Signal and Image Processing. II: Convolution, Correlation, and Applications,” IEEE Trans. Signal Process. 48, 749–767 (2000).

P. G. O. Freund, “p-Adic Strings and Their Applications,” in p-Adic Mathematical Physics: Proc. 2nd Int. Conf., Belgrade, 2005, Ed. by A. Yu. Khrennikov, Z. Rakić, and I. V. Volovich (Am. Inst. Phys., Melville, NY, 2006), AIP Conf. Proc. 826, pp. 65–73.

L. Gajić, “On Ultrametric Space,” Novi Sad J. Math. 31, 69–71 (2001).

B. Ganter and R. Wille, Formale Begriffsanalyse. Mathematische Grundlagen (Springer, Berlin, 1996). Engl. transl.: Formal Concept Analysis: Mathematical Foundations (Springer, Berlin, 1999).

F. Q. Gouvêa, p-Adic Numbers: An Introduction (Springer, Berlin, 2003).

P. Hall, J. S. Marron, and A. Neeman, “Geometric Representation of High Dimension, Low Sample Size Data,” J. R. Stat. Soc. B 67, 427–444 (2005).

P. Hitzler and A. K. Seda, “The Fixed-Point Theorems of Priess-Crampe and Ribenboim in Logic Programming,” Fields Inst. Commun. 32, 219–235 (2002).

A. K. Jain and R. C. Dubes, Algorithms for Clustering Data (Prentice-Hall, Englewood Cliffs, NJ, 1988).

A. K. Jain, M. N. Murty, and P. J. Flynn, “Data Clustering: A Review,” ACM Comput. Surv. 31, 264–323 (1999).

M. F. Janowitz, “An Order Theoretic Model for Cluster Analysis,” SIAM J. Appl. Math. 34, 55–72 (1978).

M. F. Janowitz, “Cluster Analysis Based on Abstract Posets,” Tech. rep. (2005–2006), http://dimax.rutgers.edu/~melj/poset_paper.pdf

M. Jansen, G. P. Nason, and B. W. Silverman, “Multiscale Methods for Data on Graphs and Irregular Multidimensional Situations,” J. R. Stat. Soc. B 71, 97–125 (2009).

S. C. Johnson, “Hierarchical Clustering Schemes,” Psychometrika 32, 241–254 (1967).

K. Keller and H. Lauffer, “Symbolic Analysis of High-Dimensional Time Series,” Int. J. Bifurcation Chaos Appl. Sci. Eng. 13, 2657–2668 (2003).

K. Keller, H. Lauffer, and M. Sinn, “Ordinal Analysis of EEG Time Series,” Chaos and Complexity Lett. 2, 247–258 (2007).

K. Keller and M. Sinn, “Ordinal Analysis of Time Series,” Physica A 356, 114–120 (2005).

K. Keller and M. Sinn, “Ordinal Symbolic Dynamics,” Tech. Rep. A-05-14 (Inst. Math. Univ. Lübeck, 2005), http://www.math.uni-luebeck.de/mitarbeiter/keller/wwwpapers/osdc.pdf

A. Khrennikov, Information Dynamics in Cognitive, Psychological, Social and Anomalous Phenomena (Kluwer, Dordrecht, 2004).

A. Yu. Khrennikov, “Gene Expression from Polynomial Dynamics in the 2-adic Information Space,” arXiv: q-bio/0611068v2.

F. Klein, Vergleichende Betrachtungen über neuere geometrische Forschungen (1872). Engl. transl.: “A Comparative Review of Recent Researches in Geometry,” Bull. New York Math. Soc. 2, 215–249 (1892–1893).

S. V. Kozyrev, “Wavelet Theory as p-adic Spectral Analysis,” Izv. Ross. Akad. Nauk, Ser. Mat. 66(2), 149–158 (2002) [Izv. Math. 66, 367–376 (2002)].

S. V. Kozyrev, “Wavelets and Spectral Analysis of Ultrametric Pseudodifferential Operators,” Mat. Sb. 198(1), 103–126 (2007) [Sb. Math. 198, 97–116 (2007)].

M. Krasner, “Nombres semi-réels et espaces ultramétriques,” C. R. Acad. Sci. Paris 219, 433–435 (1944).

V. Latora and M. Baranger, “Kolmogorov-Sinai Entropy Rate versus Physical Entropy,” Phys. Rev. Lett. 82, 520–523 (1999).

I. C. Lerman, Classification et analyse ordinale des données (Dunod, Paris, 1981).

A. Levy, Basic Set Theory (Dover Publ., Mineola, NY, 2002).

S. C. Madeira and A. L. Oliveira, “Biclustering Algorithms for Biological Data Analysis: A Survey,” IEEE/ACM Trans. Comput. Biol. Bioinform. 1, 24–45 (2004).

S. T. March, “Techniques for Structuring Database Records,” Comput. Surv. 15, 45–79 (1983).

W. T. McCormick, Jr., P. J. Schweitzer, and T. J. White, “Problem Decomposition and Data Reorganization by a Clustering Technique,” Oper. Res. 20, 993–1009 (1972).

I. Van Mechelen, H.-H. Bock, and P. De Boeck, “Two-Mode Clustering Methods: A Structured Overview,” Stat. Methods Med. Res. 13, 363–394 (2004).

B. Mirkin, Mathematical Classification and Clustering (Kluwer, Dordrecht, 1996).

B. Mirkin, Clustering for Data Mining (Chapman and Hall/CRC Press, Boca Raton, FL, 2005).

F. Murtagh, “A Survey of Recent Advances in Hierarchical Clustering Algorithms,” Comput. J. 26, 354–359 (1983).

F. Murtagh, “Complexities of Hierarchic Clustering Algorithms: State of the Art,” Comput. Stat. Q. 1, 101–113 (1984).

F. Murtagh, “Counting Dendrograms: A Survey,” Discrete Appl. Math. 7, 191–199 (1984).

F. Murtagh, Multidimensional Clustering Algorithms (Physica-Verlag, Vienna, 1985).

F. Murtagh, “Comments on ‘Parallel Algorithms for Hierarchical Clustering and Cluster Validity’,” IEEE Trans. Pattern Anal. Mach. Intell. 14, 1056–1057 (1992).

F. Murtagh, “On Ultrametricity, Data Coding, and Computation,” J. Classif. 21, 167–184 (2004).

F. Murtagh, “Identifying the Ultrametricity of Time Series,” Eur. Phys. J. B 43, 573–579 (2005).

F. Murtagh, “The Haar Wavelet Transform of a Dendrogram,” J. Classif. 24, 3–32 (2007).

F. Murtagh, “The Remarkable Simplicity of Very High Dimensional Data: Application of Model-Based Clustering,” J. Classif. (2009) (in press).

F. Murtagh, “The Correspondence Analysis Platform for Uncovering Deep Structure in Data and Information,” Comput. J., doi:10.1093/comjnl/bxn045 (2008).

F. Murtagh, G. Downs, and P. Contreras, “Hierarchical Clustering of Massive, High Dimensional Data Sets by Exploiting Ultrametric Embedding,” SIAM J. Sci. Comput. 30, 707–730 (2008).

F. Murtagh, J.-L. Starck, and M. W. Berry, “Overcoming the Curse of Dimensionality in Clustering by Means of the Wavelet Transform,” Comput. J. 43, 107–120 (2000).

A. Ostrowski, “Über einige Lösungen der Funktionalgleichung ϕ(x) · ϕ(y) = ϕ(xy),” Acta Math. 41, 271–284 (1917).

R. Rammal, J. C. Angles d’Auriac, and B. Doucot, “On the Degree of Ultrametricity,” J. Phys. Lett. 46, 945–952 (1985).

R. Rammal, G. Toulouse, and M. A. Virasoro, “Ultrametricity for Physicists,” Rev. Mod. Phys. 58, 765–788 (1986).

H. Reiter and J. D. Stegeman, Classical Harmonic Analysis and Locally Compact Groups, 2nd ed. (Oxford Univ. Press, Oxford, 2000).

A. C. M. Van Rooij, Non-Archimedean Functional Analysis (M. Dekker, New York, 1978).

W. H. Schikhof, Ultrametric Calculus (Cambridge Univ. Press, Cambridge, 1984), Chs. 18–21.

A. K. Seda and P. Hitzler, “Generalized Distance Functions in the Theory of Computation,” Comput. J., doi:10.1093/comjnl/bxm108 (2008).

R. Sibson, “SLINK: An Optimally Efficient Algorithm for the Single-Link Cluster Method,” Comput. J. 16, 30–34 (1973).

H. A. Simon, The Sciences of the Artificial (MIT Press, Cambridge, MA, 1996).

N. J. A. Sloane, “Sequence A000111,” in On-line Encyclopedia of Integer Sequences, http://www.research.att.com/~njas/sequences/A000111

D. Steinley, “K-Means Clustering: A Half-Century Synthesis,” Br. J. Math. Stat. Psychol. 59, 1–3 (2006).

D. Steinley and M. J. Brusco, “Initializing K-Means Batch Clustering: A Critical Evaluation of Several Techniques,” J. Classif. 24, 99–121 (2007).

Wu-Ki Tung, Group Theory in Physics (World Sci., Singapore, 1985).

S. S. Vempala, The Random Projection Method (Am. Math. Soc., Providence, RI, 2004), DIMACS Ser. Discrete Math. Theor. Comput. Sci. 65.

I. V. Volovich, “Number Theory as the Ultimate Physical Theory,” Preprint No. TH 4781/87 (CERN, Geneva, 1987).

I. V. Volovich, “p-Adic String,” Class. Quantum Grav. 4, L83–L87 (1987).

W. Weckesser, “Symbolic Dynamics in Mathematics, Physics, and Engineering,” Tech. Rep. (1997), http://www.ima.umn.edu/~weck/nbt/nbt.ps

H. Weyl, Symmetry (Princeton Univ. Press, Princeton, 1983).

Rui Xu and D. Wunsch II, “Survey of Clustering Algorithms,” IEEE Trans. Neural Netw. 16, 645–678 (2005).