Phương pháp phân tích định lượng điểm khuỷu cho số lượng cụm tối ưu trong thuật toán phân cụm
Tóm tắt
Từ khóa
Tài liệu tham khảo
D. Xu, Y. Tian, A comprehensive survey of clustering algorithms. Ann. Data Sci. 2(2), 165–193 (2015). https://doi.org/10.1007/s40745-015-0040-1
A.K. Jain, Data clustering: 50 years beyond K-means. Pattern Recognit. Lett. 31(8), 651–666 (2010). https://doi.org/10.1016/j.patrec.2009.09.011
H.-S. Park, C.-H. Jun, A simple and fast algorithm for K-medoids clustering. Expert Syst. Appl. 36(2), 3336–3341 (2009). https://doi.org/10.1016/j.eswa.2008.01.039
C.A. Sugar, G.M. James, Finding the number of clusters in a dataset. J. Am. Stat. Assoc. 98(463), 750–763 (2003). https://doi.org/10.1198/016214503000000666
T. Zhang, R. Ramakrishnan, M. Livny, BIRCH: an efficient data clustering method for very large databases. ACM SIGMOD Record. 25(2), 103–114 (1996). https://doi.org/10.1145/235968.233324
S. Guha, R. Rastogi, K. Shim, Cure: an efficient clustering algorithm for large databases. Inf. Syst. 26(1), 35–58 (2001). https://doi.org/10.1016/S0306-4379(01)00008-4
S. Guha, R. Rastogi, K. Shim, Rock: a robust clustering algorithm for categorical attributes. Inf. Syst. 25(5), 345–366 (2000). https://doi.org/10.1016/S0306-4379(00)00022-3
J.C. Bezdek, R. Ehrlich, W. Full, FCM: the fuzzy c-means clustering algorithm. Comput. Geosci. 10(2–3), 191–203 (1984). https://doi.org/10.1016/0098-3004(84)90020-7
R.N. Dave, K. Bhaswan, Adaptive fuzzy c-shells clustering and detection of ellipses. IEEE Trans. Neural Netw. 3(5), 643–662 (1992). https://doi.org/10.1109/72.159055
R.R. Yager, D.P. Filev, Approximate clustering via the mountain method. IEEE Trans. Syst. Man Cybern. 24(8), 1279–1284 (1994). https://doi.org/10.1109/21.299710
R.L. Thorndike, Who belongs in the family? Psychometrika 18(4), 267–276 (1953). https://doi.org/10.1007/BF02289263
M.A. Syakur, B.K. Khotimah, E.M.S. Rochman, B.D. Satoto, Integration K-means clustering method and elbow method for identification of the best customer profile cluster. IOP Conf. Ser. Mater. Sci. Eng. 335, 012017 (2018). https://doi.org/10.1088/1757-899X/336/1/012017
D.J. Ketchen, C.L. Shook, The application of cluster analysis in strategic management research: an analysis and critique. Strateg. Manag. J. 17(6), 441–458 (1996). https://doi.org/10.1002/(SICI)1097-0266(199606)17:6%3c441::AID-SMJ819%3e3.0.CO;2-G
R. Nainggolan, R. Perangin-angin, E. Simarmata, A. Tarigan, Improved the performance of the K-means cluster using the sum of squared error (SSE) optimized by using the Elbow method. J. Phys: Conf. Ser. 1361, 012015 (2019). https://doi.org/10.1088/1742-6596/1361/1/012015
F. Liu, Y. Deng, Determine the number of unknown targets in open World based on Elbow method. IEEE Trans. Fuzzy Syst. (2020). https://doi.org/10.1109/TFUZZ.2020.2966182
J. Yoder, C. Priebe, Semi-supervised k-means++. J. Stat. Comput. Simul. 87(13), 2597–2608 (2017). https://doi.org/10.1080/00949655.2017.1327588
A. Jain, K. Nandakumar, A. Rose, Score normalization in multimodal biometric systems. Patten Recognit. 38(12), 2270–2285 (2005). https://doi.org/10.1016/j.patcog.2005.01.012
E. Hancer, D. Karaboga, A comprehensive survey of traditional, merge-split and evolutionary approaches proposed for determination of cluster number. Swarm Evol. Comput. 32, 49–67 (2017). https://doi.org/10.1016/j.swevo.2016.06.004
M.A. Masud, J.Z. Huang, C. Wei, J. Wang, I. Khan, M. Zhong, I-nice: a new approach for identifying the number of clusters and initial cluster centres. Inf. Sci. 466, 129–151 (2018). https://doi.org/10.1016/j.ins.2018.07.034
M.Z. Rodriguez, C.H. Comin, D. Casanova, O.M. Bruno, D.R. Amancio, L.D.F. Costa, F.A. Rodrigues, Clustering algorithms: a comparative approach. PLoS ONE 14(1), e0210236 (2019). https://doi.org/10.1371/journal.pone.0210236
P.J. Rousseeuw, Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987). https://doi.org/10.1016/0377-0427(87)90125-7
O. Arbelaitz, I. Gurrutxaga, J. Muguerza, J.M. Pérez, I. Perona, An extensive comparative study of cluster validity indices. Pattern Recognit. 46(1), 243–256 (2013). https://doi.org/10.1016/j.patcog.2012.07.021
R. Tibshirani, G. Walther, T. Hastie, Estimating the number of clusters in a data set via the gap statistic. J. R. Stat. Soc. Ser. B-Stat. Methodol. 63(2), 411–423 (2001). https://doi.org/10.1111/1467-9868.00293
C. Yuan, H. Yang, Research on K-value selection method of K-means clustering algorithm. J.-Multidiscip. Sci. J. 2(2), 226–235 (2019). https://doi.org/10.3390/j2020016
H. Liu, L. Fen, J. Jian, L. Chen, Overlapping community discovery algorithm based on hierarchical agglomerative clustering. Int. J. Pattern Recognit. Artif. Intell. 32(03), 1850008 (2018). https://doi.org/10.1142/S0218001418500088
M. Dash, H. Liu, P. Scheuermann, K.L. Tan, Fast hierarchical clustering and its validation. Data Knowl. Eng. 44(1), 109–138 (2003). https://doi.org/10.1016/S0169-023X(02)00138-6
P. Burman, A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods. Biometrika 76(3), 503–514 (1989). https://doi.org/10.1093/biomet/76.3.503
S.-S. Yu, S.-W. Chu, C.-M. Wang, Y.-K. Chan, T.-C. Chang, Two improved k-means algorithms. Appl. Soft Comput. 68, 747–755 (2018). https://doi.org/10.1016/j.asoc.2017.08.032
D. Posada, T.R. Buckley, Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Syst. Biol. 53(5), 793–808 (2004). https://doi.org/10.1080/10635150490522304
J. Ding, V. Tarokh, Y. Yang, Bridging AIC and BIC: a new criterion for autoregression. IEEE Trans. Inf. Theory 64(6), 4024–4043 (2018). https://doi.org/10.1109/TIT.2017.2717599
R. Xu, D. Wunschll, Survey of clustering algorithms. IEEE Trans. Neural Netw. 16(3), 645–678 (2005). https://doi.org/10.1109/TNN.2005.845141