Unbiased split selection for classification trees based on the Gini Index