A hybrid Bayesian network and tensor factorization approach for missing value imputation to improve breast cancer recurrence prediction

Mahin Vazifehdan¹, Mohammad Hossein Moattar¹, Mehrdad Jalali¹
¹Department of Software Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran
