A hybrid classifier based on support vector machine and Jaya algorithm for breast cancer classification

Mohammed Alshutbi1, Zhiyong Li1, Moath Alrifaey2, Masoud Ahmadipour3, Muhammad Murtadha Othman3
1College of Information Science and Engineering, Hunan University, Changsha, China
2Department of Mechanical and Manufacturing Engineering, Faculty of Engineering, Universiti Putra Malaysia, Serdang, Malaysia
3School of Electrical Engineering, College of Engineering, Universiti Teknologi MARA, Shah Alam, Selangor, Malaysia

Abstract

Expert decisions and the evaluation of patient data are the most important factors affecting breast cancer analysis. For early detection of breast cancer, machine learning techniques not only support the rapid examination and diagnosis of medical data but also reduce the errors that inexperienced or unskilled decision-makers may introduce. The support vector machine (SVM) is a well-known classifier that has contributed significantly to cancer classification. However, the choice of kernel function and its parameters can significantly affect the performance of an SVM classifier. To further improve the classification accuracy of the SVM classifier for breast cancer diagnosis, an intelligent cancer classification method is proposed that selects a feature subset and simultaneously optimizes the relevant parameters of the SVM classifier, namely the penalty factor ($$c$$) and the kernel function parameter ($$\gamma$$), using the Jaya algorithm. The method (Jaya-SVM) is then applied to classify a breast cancer dataset of 699 samples, of which 458 are benign and 241 are malignant. To evaluate the effectiveness of the proposed Jaya-SVM classifier, it is compared in terms of computational complexity and classification accuracy with several other hybrid metaheuristic classifiers, namely SVM classifiers based on the genetic algorithm (GA), differential evolution (DE), particle swarm optimization (PSO), and cuckoo search (CS). In addition, the Breast Cancer Coimbra dataset from the UCI repository is used to validate the effectiveness of the proposed method. The results are presented and interpreted, and conclusions are drawn.
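To make the parameter search described in the abstract more concrete, the following is a minimal sketch of the Jaya update rule as published by Rao (2016), written in plain Python. It is not the authors' implementation. The function names `jaya_minimize` and `toy_error`, the search bounds, the population size, and the iteration count are all assumptions made for illustration. In particular, `toy_error` is a hypothetical stand-in for the cross-validated error of an SVM as a function of the logarithms of $$c$$ and $$\gamma$$, and the feature-selection part of the method is omitted.

```python
import random


def jaya_minimize(objective, bounds, pop_size=20, iterations=100, seed=0):
    """Minimize `objective` inside box `bounds` using the Jaya update rule."""
    rng = random.Random(seed)
    # Draw a random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(iterations):
        # Best and worst candidates are fixed for the whole iteration.
        best = pop[scores.index(min(scores))]
        worst = pop[scores.index(max(scores))]
        for i, x in enumerate(pop):
            candidate = []
            for j, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Move toward the best solution and away from the worst one.
                v = x[j] + r1 * (best[j] - abs(x[j])) - r2 * (worst[j] - abs(x[j]))
                candidate.append(min(max(v, lo), hi))  # clip to the bounds
            score = objective(candidate)
            if score < scores[i]:  # greedy acceptance: keep only improvements
                pop[i], scores[i] = candidate, score
    k = scores.index(min(scores))
    return pop[k], scores[k]


def toy_error(x):
    """Hypothetical stand-in for SVM validation error over (log10 c, log10 gamma)."""
    log_c, log_gamma = x
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


# Assumed search ranges for log10(c) and log10(gamma).
BOUNDS = [(-3.0, 3.0), (-5.0, 1.0)]

best_x, best_err = jaya_minimize(toy_error, BOUNDS)
print("best (log10 c, log10 gamma):", best_x, "error:", best_err)
```

In a setting closer to the one the abstract describes, `toy_error` would be replaced by a function that trains an SVM with the candidate parameters on the selected features and returns its validation error. Jaya has no algorithm-specific tuning parameters beyond population size and iteration count, which is one reason it is a convenient choice for this kind of search.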

Keywords

#breast cancer #support vector machine #Jaya algorithm #cancer classification #machine learning
