Shell-neighbor method and its application in missing data imputation

Springer Science and Business Media LLC - Volume 35, Issue 1 - Pages 123-133 - 2011
Shichao Zhang
Department of Computer Science, Zhejiang Normal University, Jinhua, China

References

Batista G, Monard MC (2003) An analysis of four missing data treatment methods for supervised learning. Appl Artif Intell 17(5–6):519–533

Berthold MR, Huber KP (1998) Missing values and learning of fuzzy rules. Int J Uncertain Fuzziness Knowl-Based Syst 6(2):171–178

Chen J, Shao J (2001) Jackknife variance estimation for nearest-neighbor imputation. J Am Stat Assoc 96:260–269

Dempster AP, Laird NM, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc, Ser B 39:1–38

Farhangfar A et al (2007) A novel framework for imputation of missing values in databases. IEEE Trans Syst Man Cybern Part A: Syst Humans 37(5):692–709

Gabrys B (2002) Neuro-fuzzy approach to processing inputs with missing values in pattern recognition problems. Int J Approx Reason 30(3):149–179

Gabrys B, Petrakieva L (2004) Combining labelled and unlabelled data in the design of pattern classification systems. Int J Approx Reason 35(3):251–273

Ghahramani Z, Jordan M (1994) Supervised learning from incomplete data via an EM approach. Adv Neural Inf Process Syst 6:120–127

Graham J, Cumsille P, Elek-Fisk E (2003) Methods for handling missing data. In: Handbook of psychology, vol 2. Wiley, New York, pp 87–114

Han J, Kamber M (2006) Data mining: concepts and techniques, 2nd edn. Morgan Kaufmann, San Mateo

Kang SS, Koehler K, Larsen MD (2007) Partial FEFI for incomplete tables with covariates. Iowa State University Press, Ames

Kothari R, Jain V (2002) Learning from labeled and unlabeled data. In: Proceedings of the 2002 international joint conference on neural networks, vol 3, pp 2803–2808

Lin D (1998) An information-theoretic definition of similarity. In: ICML-98, pp 296–304

Little R, Rubin D (2002) Statistical analysis with missing data. Wiley, New York

Mitchell T (1999) The role of unlabeled data in supervised learning. In: Proceedings of the sixth international colloquium on cognitive science

Nauck D, Kruse R (1999) Learning in neuro-fuzzy systems with symbolic attributes and missing values. In: Proceedings of the international conference on neural information processing (ICONIP’99), Perth, pp 142–147

Nijman MJ, Kappen HJ (1997) Symmetry breaking and training from incomplete data with radial basis Boltzmann machines. Int J Neural Syst 8(3):301–315

Peng C, Zhu J (2008) Comparison of two approaches for handling missing covariates in logistic regression. Educ Psychol Meas 68(1):58–77

Qin YS et al (2007) Semi-parametric optimization for missing data imputation. Appl Intell 27(1):79–88

Quinlan J (1993) C4.5: Programs for machine learning. Morgan Kaufmann, San Mateo

Rubin D (1976) Inference and missing data. Biometrika 63(3):581–592

Schafer J (1997) Analysis of incomplete multivariate data. Chapman & Hall, London

Schafer J, Graham J (2002) Missing data: Our view of the state of the art. Psychol Methods 7(2):147–177

Tresp V, Ahmad S, Neuneier R (1994) Training neural networks with deficient data. Adv Neural Inf Process Syst 6:128–135

Zhang CQ et al (2007) GBKII: an imputation method for missing values. In: PAKDD-07, pp 1080–1087

Zhang SC (2008) Parimputation: from imputation and null-imputation to partially imputation. IEEE Intell Inf Bull 9(1):32–38

Zhang SC, Qin ZX, Sheng SL, Ling CL (2005) “Missing is useful”: missing values in cost-sensitive decision trees. IEEE Trans Knowl Data Eng 17(12):1689–1693

Zhang SC et al (2008) Missing value imputation based on data clustering. Trans Comput Sci J 1:128–138

Zhang SC, Zhang CQ, Yang Q (2004) Information enhancement for data mining. IEEE Intell Syst 19:12–13