Robust clustering of functional directional data
Abstract
A robust approach for clustering functional directional data is proposed. The proposal adapts “impartial trimming” techniques to this particular framework. Impartial trimming lets the dataset itself determine which curves appear to be the most outlying. A feasible algorithm, justified by some theoretical properties, is proposed for practical implementation. A “warping” approach is also introduced, which allows controlled time warping to be included in the robust clustering procedure in order to detect typical “templates”. The proposed methodology is illustrated in a real data analysis problem, where it is applied to cluster aircraft trajectories.
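To make the idea of impartial trimming concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes each curve is observed on a common time grid as an array of unit vectors, uses the time-averaged value of one minus the cosine as a dissimilarity, and runs from a single random start; the function name `trimmed_directional_kmeans` and all of its parameters are hypothetical. Time warping is not included.

```python
import numpy as np

def trimmed_directional_kmeans(curves, k, alpha, n_iter=20, seed=0):
    """Illustrative trimmed k-means for curves of unit vectors.

    curves : array of shape (n, T, d), each curves[i, t] a unit vector.
    k      : number of clusters.
    alpha  : fraction of curves to trim (assumed 0 <= alpha < 1).
    Returns labels (n,), with -1 marking trimmed curves, and centers (k, T, d).
    """
    rng = np.random.default_rng(seed)
    n = curves.shape[0]
    n_keep = n - int(np.floor(alpha * n))
    # Assumption: initialise the centers with k randomly chosen curves.
    centers = curves[rng.choice(n, size=k, replace=False)].copy()
    labels = np.full(n, -1)
    for _ in range(n_iter):
        # Dissimilarity to each center: mean over time of (1 - cosine).
        cos = np.einsum("ntd,ktd->nkt", curves, centers)
        dist = (1.0 - cos).mean(axis=2)            # shape (n, k)
        nearest = dist.argmin(axis=1)
        d_min = dist.min(axis=1)
        # Impartial trimming: keep only the n_keep curves closest to a center;
        # the data decide which curves are discarded.
        keep = np.argsort(d_min)[:n_keep]
        labels = np.full(n, -1)
        labels[keep] = nearest[keep]
        # Update each center as the pointwise normalised mean direction.
        for j in range(k):
            members = curves[labels == j]
            if len(members) == 0:
                continue                            # leave an empty cluster's center unchanged
            mean = members.sum(axis=0)              # shape (T, d)
            norm = np.linalg.norm(mean, axis=1)
            ok = norm > 0                           # skip time points with undefined direction
            centers[j][ok] = mean[ok] / norm[ok, None]
    return labels, centers
```

In this sketch the curves labelled `-1` are those the procedure regards as most outlying at the last assignment step. A practical implementation would typically use several random starts and keep the solution with the smallest trimmed objective.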