A new cluster-based oversampling method for improving survival prediction of hepatocellular carcinoma patients
Abstract
Keywords
References
World Health Organization, Globocan 2012: estimated cancer incidence, mortality and prevalence worldwide in 2012. <http://globocan.iarc.fr/>.
World Health Organization, Cancer fact sheet, 2014. <http://www.who.int/mediacentre/factsheets/fs297>.
Anon., European association for the study of the liver, European organisation for research and treatment of cancer, EASL-EORTC clinical practice guidelines: management of hepatocellular carcinoma, J. Hepatol. 56 (4) (2012) 908–943.
Marinho, 2007, Rising costs and hospital admissions for hepatocellular carcinoma in Portugal (1993–2005), World J. Gastroenterol., 13, 1522, 10.3748/wjg.v13.i10.1522
Liga Portuguesa Contra o Cancro, Cancro do fígado pode aumentar 70 por cento até, 2015. <http://www.ligacontracancro.pt/noticias/detalhes.php?id=115>.
Burke, 1997, Artificial neural networks improve the accuracy of cancer survival prediction, Cancer, 79, 857, 10.1002/(SICI)1097-0142(19970215)79:4<857::AID-CNCR24>3.0.CO;2-Y
Thongkam, 2009, Toward breast cancer survivability prediction models through improving training space, Expert Syst. Appl., 36, 12200, 10.1016/j.eswa.2009.04.067
Esfandiari, 2014, Knowledge discovery in medicine: current issue and future trend, Expert Syst. Appl., 41, 4434, 10.1016/j.eswa.2014.01.011
Abreu, 2014, Overall survival prediction for women breast cancer using ensemble methods and incomplete clinical data, vol. 41, 1366
Abreu, 2014, Personalizing breast cancer patients with heterogeneous data, vol. 42, 39
Yuan, 1998, Neural-network design for small training sets of high dimension, IEEE Trans. Neural Netw., 9, 266, 10.1109/72.661122
Andonie, 2010, Extreme data mining: inference from small datasets, Int. J. Comput. Commun. Control, 5, 280, 10.15837/ijccc.2010.3.2481
Harrell, 1996, Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors, Stat. Med., 15, 361, 10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4
García-Laencina, 2010, Pattern classification with missing data: a review, Neural Comput. Appl., 19, 263, 10.1007/s00521-009-0295-6
Qi, 2013, On an ensemble algorithm for clustering cancer patient data, BMC Syst. Biol., 7, S9, 10.1186/1752-0509-7-S4-S9
Chawla, 2002, SMOTE: synthetic minority over-sampling technique, J. Artif. Intell. Res., 16, 321, 10.1613/jair.953
Durand, 2005, Assessment of the prognosis of cirrhosis: Child-Pugh versus MELD, J. Hepatol., 42, S100, 10.1016/j.jhep.2004.11.015
Cruz, 2006, Applications of machine learning in cancer prediction and prognosis, Cancer Informat., 2, 59, 10.1177/117693510600200030
Wasyluk, 2010, Founding of database for cirrhotic patients for early detection of hepatocellular carcinoma, Hepatology, 6, 13
Ho, 2012, Disease-free survival after hepatic resection in hepatocellular carcinoma patients: a prediction approach using artificial neural network, PLoS ONE, 7, e29179, 10.1371/journal.pone.0029179
H.C. Chiu, T.W. Ho, K.T. Lee, H.Y. Chen, W.H. Ho, Mortality predicted accuracy for hepatocellular carcinoma patients with hepatic resection using artificial neural network, Sci. World J. 2013 (2013) 201976.
Shi, 2012, Comparison of artificial neural network and logistic regression models for predicting in-hospital mortality after primary liver cancer surgery, PLoS ONE, 7, e35781, 10.1371/journal.pone.0035781
Little, 2002
Cismondi, 2013, Missing data in medical databases: impute, delete or classify?, Artif. Intell. Med., 58, 63, 10.1016/j.artmed.2013.01.003
García-Laencina, 2009, K nearest neighbours with mutual information for simultaneous classification and missing data imputation, Neurocomputing, 72, 1483, 10.1016/j.neucom.2008.11.026
García-Laencina, 2013, Classifying patterns with missing values using multi-task learning perceptrons, Expert Syst. Appl., 40, 1333, 10.1016/j.eswa.2012.08.057
García-Laencina, 2015, Missing data imputation on the 5-year survival prediction of breast cancer patients with unknown discrete values, Comput. Biol. Med., 59, 125, 10.1016/j.compbiomed.2015.02.006
Little, 1999, Methods for handling missing values in clinical trials, J. Rheumatol., 26, 1654
Troyanskaya, 2001, Missing value estimation methods for DNA microarrays, Bioinformatics, 17, 520, 10.1093/bioinformatics/17.6.520
Jerez, 2010, Missing data imputation using statistical and machine learning methods in a real breast cancer problem, Artif. Intell. Med., 50, 105, 10.1016/j.artmed.2010.05.002
Batista, 2003, An analysis of four missing data treatment methods for supervised learning, Appl. Artif. Intell., 17, 519, 10.1080/713827181
Suarez-Alvarez, 2012, Statistical approach to normalization of feature vectors and clustering of mixed datasets, Proc. Roy. Soc. London A: Math. Phys. Eng. Sci., 468, 2630, 10.1098/rspa.2011.0704
Tibshirani, 2001, Estimating the number of clusters in a data set via the gap statistic, J. Roy. Stat. Soc.: Ser. B (Stat. Methodol.), 63, 411, 10.1111/1467-9868.00293
Jain, 2010, Data clustering: 50 years beyond k-means, Pattern Recogn. Lett., 31, 651, 10.1016/j.patrec.2009.09.011
Chauhan, 2010, Data clustering method for discovering clusters in spatial cancer databases, Int. J. Comput. Appl., 10, 9
Winkler, 2013, An integrated clustering and classification approach for the analysis of tumor patient data, vol. 8111, 388
He, 2009, Learning from imbalanced data, IEEE Trans. Knowl. Data Eng., 21, 1263, 10.1109/TKDE.2008.239
Bishop, 2006
D. Arthur, S. Vassilvitskii, K-means++: the advantages of careful seeding, in: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’07, 2007, pp. 1027–1035.
Dudoit, 2003, Bagging to improve the accuracy of a clustering procedure, Bioinformatics, 19, 1090, 10.1093/bioinformatics/btg038
Vega-Pons, 2011, A survey of clustering ensembles, Int. J. Pattern Recogn. Artif. Intell., 25, 337, 10.1142/S0218001411008683
Yang, 2014, Exploring the diversity in cluster ensemble generation: random sampling and random projection, Expert Syst. Appl., 41, 4844, 10.1016/j.eswa.2014.01.028
Yu, 2014, Probabilistic cluster structure ensemble, Inform. Sci., 267, 16, 10.1016/j.ins.2014.01.030
de Vries, 1986, Stratified random sampling, 31
Huang, 2005, Using AUC and accuracy in evaluating learning algorithms, IEEE Trans. Knowl. Data Eng., 17, 290
Demšar, 2006, Statistical comparisons of classifiers over multiple data sets, J. Mach. Learn. Res., 7, 1