A new cluster-based oversampling method for improving survival prediction of hepatocellular carcinoma patients

Journal of Biomedical Informatics - Volume 58 - Pages 49-59 - 2015

Miriam Seoane Santos¹,², Pedro Henriques Abreu¹,², Pedro J. García-Laencina³, Adélia Simão⁴, Armando Carvalho⁴

¹Centre for Informatics and Systems, University of Coimbra, Pólo II, Pinhal de Marrocos, 3030-290 Coimbra, Portugal

²Department of Informatics Engineering, Faculty of Sciences and Technology, University of Coimbra, Pólo II, Pinhal de Marrocos, 3030-290 Coimbra, Portugal

³Centro Universitario de la Defensa de San Javier (University Centre of Defence at the Spanish Air Force Academy), MDE-UPCT, Calle Coronel López Peña, s/n, 30720 Santiago de la Ribera, Murcia, Spain

⁴Internal Medicine Service, Hospital and University Centre of Coimbra, EPE, Rua Fonseca Pinto, 3000-075 Coimbra, Portugal

References

World Health Organization, Globocan 2012: estimated cancer incidence, mortality and prevalence worldwide in 2012. <http://globocan.iarc.fr/>.

World Health Organization, Cancer fact sheet, 2014. <http://www.who.int/mediacentre/factsheets/fs297>.

Anon., European association for the study of the liver, European organisation for research and treatment of cancer, EASL-EORTC clinical practice guidelines: management of hepatocellular carcinoma, J. Hepatol. 56 (4) (2012) 908–943.

Marinho, 2007, Rising costs and hospital admissions for hepatocellular carcinoma in Portugal (1993–2005), World J. Gastroenterol., 13, 1522, 10.3748/wjg.v13.i10.1522

Liga Portuguesa Contra o Cancro, Cancro do fígado pode aumentar 70 por cento até, 2015. <http://www.ligacontracancro.pt/noticias/detalhes.php?id=115>.

Burke, 1997, Artificial neural networks improve the accuracy of cancer survival prediction, Cancer, 79, 857, 10.1002/(SICI)1097-0142(19970215)79:4<857::AID-CNCR24>3.0.CO;2-Y

Thongkam, 2009, Toward breast cancer survivability prediction models through improving training space, Expert Syst. Appl., 36, 12200, 10.1016/j.eswa.2009.04.067

Esfandiari, 2014, Knowledge discovery in medicine: current issue and future trend, Expert Syst. Appl., 41, 4434, 10.1016/j.eswa.2014.01.011

Abreu, 2014, Overall survival prediction for women breast cancer using ensemble methods and incomplete clinical data, vol. 41, 1366

Abreu, 2014, Personalizing breast cancer patients with heterogeneous data, vol. 42, 39

Yuan, 1998, Neural-network design for small training sets of high dimension, IEEE Trans. Neural Netw., 9, 266, 10.1109/72.661122

Andonie, 2010, Extreme data mining: Inference from small datasets, Int. J. Comput. Commun. Control, 5, 280, 10.15837/ijccc.2010.3.2481

Harrell, 1996, Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors, Stat. Med., 15, 361, 10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4

García-Laencina, 2010, Pattern classification with missing data: a review, Neural Comput. Appl., 19, 263, 10.1007/s00521-009-0295-6

Qi, 2013, On an ensemble algorithm for clustering cancer patient data, BMC Syst. Biol., 7, S9, 10.1186/1752-0509-7-S4-S9

Chawla, 2002, SMOTE: synthetic minority over-sampling technique, J. Artif. Intell. Res., 16, 321, 10.1613/jair.953

Forner, 2012, Hepatocellular carcinoma, Lancet, 379, 1245, 10.1016/S0140-6736(11)61347-0

Durand, 2005, Assessment of the prognosis of cirrhosis: Child-Pugh versus MELD, J. Hepatol., 42, S100, 10.1016/j.jhep.2004.11.015

Cruz, 2006, Applications of machine learning in cancer prediction and prognosis, Cancer Informat., 2, 59, 10.1177/117693510600200030

Wasyluk, 2010, Founding of database for cirrhotic patients for early detection of hepatocellular carcinoma, Hepatology, 6, 13

Ho, 2012, Disease-free survival after hepatic resection in hepatocellular carcinoma patients: a prediction approach using artificial neural network, PLoS ONE, 7, e29179, 10.1371/journal.pone.0029179

H.C. Chiu, T.W. Ho, L.K. T., H.Y. Chen, W.H. Ho, Mortality predicted accuracy for hepatocellular carcinoma patients with hepatic resection using artificial neural network, Sci. World J. 2013 (2013) 201976–10.

Shi, 2012, Comparison of artificial neural network and logistic regression models for predicting in-hospital mortality after primary liver cancer surgery, PLoS ONE, 7, e35781, 10.1371/journal.pone.0035781

Little, 2002

Cismondi, 2013, Missing data in medical databases: impute, delete or classify?, Artif. Intell. Med., 58, 63, 10.1016/j.artmed.2013.01.003

García-Laencina, 2009, K nearest neighbours with mutual information for simultaneous classification and missing data imputation, Neurocomputing, 72, 1483, 10.1016/j.neucom.2008.11.026

García-Laencina, 2013, Classifying patterns with missing values using multi-task learning perceptrons, Expert Syst. Appl., 40, 1333, 10.1016/j.eswa.2012.08.057

García-Laencina, 2015, Missing data imputation on the 5-year survival prediction of breast cancer patients with unknown discrete values, Comput. Biol. Med., 59, 125, 10.1016/j.compbiomed.2015.02.006

Little, 1999, Methods for handling missing values in clinical trials, J. Rheumatol., 26, 1654

Troyanskaya, 2001, Missing value estimation methods for DNA microarrays, Bioinformatics, 17, 520, 10.1093/bioinformatics/17.6.520

Jerez, 2010, Missing data imputation using statistical and machine learning methods in a real breast cancer problem, Artif. Intell. Med., 50, 105, 10.1016/j.artmed.2010.05.002

Batista, 2003, An analysis of four missing data treatment methods for supervised learning, Appl. Artif. Intell., 17, 519, 10.1080/713827181

Suarez-Alvarez, 2012, Statistical approach to normalization of feature vectors and clustering of mixed datasets, Proc. Roy. Soc. London A: Math. Phys. Eng. Sci., 468, 2630, 10.1098/rspa.2011.0704

Tibshirani, 2001, Estimating the number of clusters in a data set via the gap statistic, J. Roy. Stat. Soc.: Ser. B (Stat. Methodol.), 63, 411, 10.1111/1467-9868.00293

Jain, 2010, Data clustering: 50 years beyond k-means, Pattern Recogn. Lett., 31, 651, 10.1016/j.patrec.2009.09.011

Chauhan, 2010, Data clustering method for discovering clusters in spatial cancer databases, Int. J. Comput. Appl., 10, 9

Winkler, 2013, An integrated clustering and classification approach for the analysis of tumor patient data, vol. 8111, 388

He, 2009, Learning from imbalanced data, IEEE Trans. Knowl. Data Eng., 21, 1263, 10.1109/TKDE.2008.239

Bishop, 2006

D. Arthur, S. Vassilvitskii, K-means++: the advantages of careful seeding, in: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’07, 2007, pp. 1027–1035.

Dudoit, 2003, Bagging to improve the accuracy of a clustering procedure, Bioinformatics, 19, 1090, 10.1093/bioinformatics/btg038

Vega-Pons, 2011, A survey of clustering ensembles, Int. J. Pattern Recogn. Artif. Intell., 25, 337, 10.1142/S0218001411008683

Yang, 2014, Exploring the diversity in cluster ensemble generation: random sampling and random projection, Expert Syst. Appl., 41, 4844, 10.1016/j.eswa.2014.01.028

Yu, 2014, Probabilistic cluster structure ensemble, Inform. Sci., 267, 16, 10.1016/j.ins.2014.01.030

de Vries, 1986, Stratified random sampling, 31

Huang, 2005, Using AUC and accuracy in evaluating learning algorithms, IEEE Trans. Knowl. Data Eng., 17, 290

Demšar, 2006, Statistical comparisons of classifiers over multiple data sets, J. Machine Learning Res., 7, 1
