A hybrid model for class noise detection using k-means and classification filtering algorithms