Comparison of five iterative imputation methods for multivariate classification

Chemometrics and Intelligent Laboratory Systems - Volume 120 - Pages 106-115 - 2013
Yushan Liu1, Steven D. Brown1
1Department of Chemistry and Biochemistry, University of Delaware, Brown Laboratory, 163 The Green, Newark, DE 19716, USA
