Neuropsychological predictors of conversion from mild cognitive impairment to Alzheimer’s disease: a feature selection ensemble combining stability and predictability

BMC Medical Informatics and Decision Making - Volume 18 - Pages 1-20 - 2018
Telma Pereira1,2, Francisco L. Ferreira2, Sandra Cardoso3, Dina Silva4, Alexandre de Mendonça3, Manuela Guerreiro3, Sara C. Madeira1
1LASIGE, Faculdade de Ciências, Universidade de Lisboa, Lisbon, Portugal
2Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
3Laboratório de Neurociências, Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
4Cognitive Neuroscience Research Group, Department of Psychology and Educational Sciences and Centre for Biomedical Research (CBMR), University of Algarve, Faro, Portugal

Abstract

Predicting progression from Mild Cognitive Impairment (MCI) to Alzheimer’s Disease (AD) remains a major open issue in AD-related research. Neuropsychological assessment has proven useful in identifying MCI patients who are likely to convert to dementia. However, the large battery of neuropsychological tests (NPTs) performed in clinical practice and the limited number of training examples pose a challenge to machine learning when learning prognostic models. In this context, it is paramount to pursue approaches that effectively search for reduced sets of relevant features. Subsets of NPTs from which prognostic models can be learnt should not only be good predictors but also stable, promoting generalizable and explainable models. We propose a feature selection (FS) ensemble combining stability and predictability to choose the most relevant NPTs for prognostic prediction in AD. First, we combine the outcome of multiple (filter and embedded) FS methods. Then, we use a wrapper-based approach optimizing both stability and predictability to compute the number of selected features. We use two large prospective studies (ADNI and the Portuguese Cognitive Complaints Cohort, CCC) to evaluate the approach and assess the predictive value of a large number of NPTs. The best subsets of features include approximately 30 and 20 features (from the original 79 and 40) for ADNI and CCC data, respectively, yielding stability above 0.89 and 0.95, and AUC above 0.87 and 0.82. Most NPTs selected by the proposed FS ensemble have been identified in the literature as strong predictors of conversion from MCI to AD. The FS ensemble approach was able to 1) identify subsets of stable and relevant predictors from a consensus of multiple FS methods using baseline NPTs and 2) learn reliable prognostic models of conversion from MCI to AD using these subsets of features. The machine learning models learnt from these features outperformed the models trained without FS and achieved competitive results when compared to commonly used FS algorithms. Furthermore, the selected features are derived from a consensus of methods and are therefore more robust, while relieving users of the need to choose the most appropriate FS method for their classification task.
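
The two-step procedure described in the abstract (combine the rankings of several FS methods, then pick the number of features that best balances stability and predictability) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the aggregation rule, the stability measure, or how the two criteria are combined, so mean-rank aggregation, average pairwise Jaccard similarity, and the harmonic mean are assumptions chosen for simplicity. The feature names, rankings, and AUC values are hypothetical toy data; in practice the AUC for each subset size would come from cross-validating a classifier.

```python
from itertools import combinations


def aggregate_mean_rank(rankings):
    """Merge several rankings (best feature first) by mean rank position."""
    features = sorted(rankings[0])
    mean_pos = {
        f: sum(r.index(f) for r in rankings) / len(rankings) for f in features
    }
    # Sort by mean position; ties fall back to the feature name.
    return sorted(features, key=lambda f: (mean_pos[f], f))


def jaccard_stability(subsets):
    """Average pairwise Jaccard similarity between selected feature subsets."""
    pairs = list(combinations(subsets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


def harmonic_mean(stability, auc):
    """Balance stability and predictability (assumed combination rule)."""
    if stability + auc == 0:
        return 0.0
    return 2 * stability * auc / (stability + auc)


def choose_subset_size(fold_rankings, auc_by_k):
    """Pick the subset size k with the best stability/AUC trade-off."""
    # One consensus ranking per data fold, built from all FS methods.
    consensus = [aggregate_mean_rank(r) for r in fold_rankings]
    best_k, best_score = None, -1.0
    for k, auc in sorted(auc_by_k.items()):
        subsets = [set(c[:k]) for c in consensus]
        score = harmonic_mean(jaccard_stability(subsets), auc)
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score


# Hypothetical toy input: two folds, two FS methods per fold, four features.
fold_rankings = [
    [["recall", "fluency", "trails", "digits"],
     ["recall", "trails", "fluency", "digits"]],
    [["fluency", "recall", "digits", "trails"],
     ["recall", "fluency", "trails", "digits"]],
]
# Hypothetical cross-validated AUC for each candidate subset size.
auc_by_k = {1: 0.70, 2: 0.80, 3: 0.85, 4: 0.78}

best_k, best_score = choose_subset_size(fold_rankings, auc_by_k)
print(best_k, round(best_score, 3))
```

On this toy input the sketch selects k = 2: the three-feature subset has the highest AUC, but its membership differs between folds, so its lower stability pulls the combined score down.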
