Robust clustering of functional directional data

Advances in Data Analysis and Classification - Volume 16 - Pages 181-199 - 2021
Pedro C. Álvarez-Esteban¹, Luis A. García-Escudero¹
¹ Dpto. de Estadística e Investigación Operativa, IMUVA, Universidad de Valladolid, Valladolid, Spain

Abstract

A robust approach for clustering functional directional data is proposed. The proposal adapts “impartial trimming” techniques to this particular framework. Impartial trimming lets the dataset itself determine which curves appear to be the most outlying. A feasible algorithm, justified by some theoretical properties, is proposed for its practical implementation. A “warping” approach is also introduced, which allows controlled time warping to be included in the robust clustering procedure in order to detect typical “templates”. The methodology is illustrated on a real data analysis problem, where it is applied to cluster aircraft trajectories.
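
The impartial-trimming idea summarized above can be illustrated with a minimal sketch. This is not the authors' algorithm, and it leaves out the warping step entirely. It assumes that every curve is sampled on a common time grid as a sequence of unit vectors, that the dissimilarity between a curve and a cluster center is one minus the cosine averaged over time, and that centers are updated as pointwise normalized means. The function name `trimmed_directional_kmeans` and its parameters are hypothetical.

```python
import numpy as np


def trimmed_directional_kmeans(curves, k, alpha, n_iter=20, seed=0):
    """Illustrative trimmed k-means for curves of unit vectors.

    curves: array of shape (n, T, d); each curves[i, t] is a unit vector.
    k: number of clusters; alpha: proportion of curves to trim.
    Returns labels (with -1 for trimmed curves) and centers of shape (k, T, d).
    """
    rng = np.random.default_rng(seed)
    n = curves.shape[0]
    n_keep = n - int(np.floor(alpha * n))

    def assign(centers):
        # Assumed dissimilarity: mean over time of (1 - cosine).
        cos = np.einsum("ntd,ktd->nkt", curves, centers)
        dist = (1.0 - cos).mean(axis=2)            # shape (n, k)
        labels = dist.argmin(axis=1)
        nearest = dist.min(axis=1)
        # Impartial trimming: the data decide which curves are discarded,
        # namely those farthest from their closest center.
        kept = np.argsort(nearest)[:n_keep]
        return labels, kept

    # Start from k randomly chosen curves (a single random start).
    centers = curves[rng.choice(n, size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels, kept = assign(centers)
        for j in range(k):
            members = kept[labels[kept] == j]
            if members.size == 0:
                continue  # keep the previous center for an empty cluster
            mean = curves[members].mean(axis=0)    # shape (T, d)
            norm = np.linalg.norm(mean, axis=1, keepdims=True)
            # Project the pointwise mean back onto the unit sphere.
            centers[j] = np.where(norm > 1e-12,
                                  mean / np.maximum(norm, 1e-12),
                                  centers[j])

    labels, kept = assign(centers)
    trimmed = np.ones(n, dtype=bool)
    trimmed[kept] = False
    return np.where(trimmed, -1, labels), centers


# Synthetic example (fabricated for illustration only): two groups of
# circular curves plus two curves with erratic directions.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 50)


def angle_curves(base, m):
    theta = base + 0.5 * t + 0.05 * rng.standard_normal((m, t.size))
    return np.stack([np.cos(theta), np.sin(theta)], axis=-1)


theta_out = rng.uniform(0.0, 2.0 * np.pi, (2, t.size))
outliers = np.stack([np.cos(theta_out), np.sin(theta_out)], axis=-1)
curves = np.concatenate([angle_curves(0.0, 10), angle_curves(2.5, 10), outliers])

labels, centers = trimmed_directional_kmeans(curves, k=2, alpha=0.1)
```

Because the sketch uses a single random start, it can stop at a poor local solution; trimmed clustering procedures are usually run from several random starts, keeping the best result.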
