Parallel reduced multi-class contour preserving classification
Abstract
Multi-class contour preserving classification is a contour-preserving technique that synthesizes two types of vectors, fundamental multi-class outpost vectors (FMCOVs) and additional multi-class outpost vectors (AMCOVs), at the decision boundary between classes of data to improve the classification accuracy of a feed-forward neural network. However, the number of vectors of both types is very large, which significantly prolongs training time. Reduced multi-class contour preserving classification provides three practical methods for reducing the number of FMCOVs and AMCOVs. These three reduced multi-class outpost vector methods are serial, however, and therefore have limited applicability on modern machines with multiple CPU cores or processors. This paper presents the methodologies and frameworks of three parallel reduced multi-class outpost vector methods that use thread-level and process-level parallelism to (1) substantially reduce the number of FMCOVs and AMCOVs, (2) achieve speedups in execution time proportional to the number of available CPU cores or processors, and (3) significantly improve the classification performance (accuracy, precision, recall, and F1 score) of the feed-forward neural network. Experiments on balanced and imbalanced real-world multi-class data sets downloaded from the UCI Machine Learning Repository confirmed the reduction in outpost vectors, the speedups, and the improvement in classification performance.
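As a rough illustration of the quantities named above, the sketch below computes accuracy, precision, recall, and F1 score from a multi-class confusion matrix, together with speedup and parallel efficiency from measured run times. The abstract does not say how the per-class scores are aggregated, so macro-averaging (the unweighted mean over classes) is an assumption here, and the function names `macro_metrics`, `speedup`, and `efficiency` are hypothetical rather than taken from the paper.

```python
from typing import List, Tuple


def macro_metrics(confusion: List[List[int]]) -> Tuple[float, float, float, float]:
    """Return (accuracy, macro precision, macro recall, macro F1).

    confusion[i][j] counts samples of true class i predicted as class j.
    Macro-averaging is an assumption; the abstract does not name the scheme.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(k))

    precisions, recalls, f1s = [], [], []
    for c in range(k):
        tp = confusion[c][c]
        predicted = sum(confusion[i][c] for i in range(k))  # column sum
        actual = sum(confusion[c])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    accuracy = correct / total if total else 0.0
    return accuracy, sum(precisions) / k, sum(recalls) / k, sum(f1s) / k


def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Ratio of serial run time to parallel run time."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds: float, parallel_seconds: float, workers: int) -> float:
    """Speedup divided by the number of cores or processes used."""
    return speedup(serial_seconds, parallel_seconds) / workers
```

A speedup that is proportional to the number of cores corresponds to an efficiency that stays roughly constant as `workers` grows; an efficiency of 1.0 would mean ideal linear scaling.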