Supervised classification with conditional Gaussian networks: Increasing the structure complexity from naive Bayes
References
Anderson, 1958
S.G. Bottcher, Learning Bayesian networks with mixed variables, PhD thesis, Aalborg University, 2004.
Castillo, 1997
J. Cheng, R. Greiner, Comparing Bayesian network classifiers, in: Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence, 1999, pp. 101–107.
Chow, 1968, Approximating discrete probability distributions with dependence trees, IEEE Transactions on Information Theory, 14, 462, 10.1109/TIT.1968.1054142
Cover, 1991
Cover, 1967, Nearest neighbour pattern classification, IEEE Transactions on Information Theory, 13, 21, 10.1109/TIT.1967.1053964
DeGroot, 1970
Domingos, 2000, A unified bias-variance decomposition and its applications, 231
Domingos, 1997, On the optimality of the simple Bayesian classifier under zero-one loss, Machine Learning, 29, 103, 10.1023/A:1007413511361
Duda, 1973
Dudewicz, 1988
M. Egmont-Petersen, Feature selection by Markov chain Monte Carlo sampling: a Bayesian approach, in: Proceedings of the Joint IAPR Workshops SSPR 2004 and SPR 2004, 2004, pp. 1034–1042.
U. Fayyad, K. Irani, Multi-interval discretization of continuous-valued attributes for classification learning, in: Proceedings of the 13th International Joint Conference on Artificial Intelligence, 1993, pp. 1022–1027.
Fisher, 1936, The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7, 179, 10.1111/j.1469-1809.1936.tb02137.x
Friedman, 1997, On bias, variance, 0/1-loss, and the curse-of-dimensionality, Data Mining and Knowledge Discovery, 1, 55, 10.1023/A:1009778005914
Friedman, 1997, Bayesian network classifiers, Machine Learning, 29, 131, 10.1023/A:1007465528199
N. Friedman, M. Goldszmidt, T. Lee, Bayesian network classification with continuous attributes: getting the best of both discretization and parametric fitting, in: Proceedings of the 15th International Conference on Machine Learning, 1998.
D. Geiger, D. Heckerman, Learning Gaussian networks, Technical Report, Microsoft Research, Advanced Technology Division, 1994.
Geiger, 1996, Beyond Bayesian networks: similarity networks and Bayesian multinets, Artificial Intelligence, 82, 45, 10.1016/0004-3702(95)00014-3
Geman, 1992, Neural networks and the bias-variance dilemma, Neural Computation, 4, 1, 10.1162/neco.1992.4.1.1
Giudici, 1999, Decomposable graphical Gaussian model determination, Biometrika, 86, 785, 10.1093/biomet/86.4.785
Goldberg, 1989
D. Grossman, P. Domingos, Learning Bayesian network classifiers by maximizing conditional likelihood, in: Proceedings of the 21st International Conference on Machine Learning, 2004.
Guyon, 2003, An introduction to variable and feature selection, Journal of Machine Learning Research, 3, 1157
M.A. Hall, L.A. Smith, Feature subset selection: a correlation based filter approach, in: Proceedings of the Fourth International Conference on Neural Information Processing and Intelligent Information Systems, 1997, pp. 855–858.
Heckerman, 1995, Learning Bayesian networks: the combination of knowledge and statistical data, Machine Learning, 20, 197, 10.1007/BF00994016
James, 2003, Variance and bias for general loss functions, Machine Learning, 51, 115, 10.1023/A:1022899518027
T. Jebara, Discriminative, generative, and imitative learning, PhD thesis, Massachusetts Institute of Technology, 2001.
G. John, P. Langley, Estimating continuous distributions in Bayesian classifiers, in: Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence, 1995, pp. 338–345.
Johnson, 2002
Jolliffe, 1986
E.J. Keogh, M. Pazzani, Learning augmented Bayesian classifiers: a comparison of distribution-based and non distribution-based approaches, in: Proceedings of the 7th International Workshop on Artificial Intelligence and Statistics, 1999, pp. 225–230.
R. Kohavi, Wrappers for performance enhancement and oblivious decision graphs, PhD Thesis, Computer Science department, 1995.
Kohavi, 1997, Wrappers for feature subset selection, Artificial Intelligence, 97, 273, 10.1016/S0004-3702(97)00043-X
R. Kohavi, D.H. Wolpert, Bias plus variance decomposition for zero-one loss functions, in: International Conference on Machine Learning, 1996.
I. Kononenko, Semi-naive Bayesian classifiers, in: Proceedings of the 6th European Working Session on Learning, 1991, pp. 206–219.
Kudo, 2000, Comparison of algorithms that select features for pattern classifiers, Pattern Recognition, 33, 25
P. Langley, W. Iba, K. Thompson, An analysis of Bayesian classifiers, in: Proceedings of the 10th National Conference on Artificial Intelligence, 1992, pp. 223–228.
P. Langley, S. Sage, Induction of selective Bayesian classifiers, in: Proceedings of the 10th Conference on Uncertainty in Artificial Intelligence, 1994, pp. 399–406.
Larrañaga, 2002
Lauritzen, 1996
S.L. Lauritzen, N. Wermuth, Mixed interaction models. Technical Report r 84-8, Institute for Electronic Systems, Aalborg University, 1984.
Lauritzen, 1989, Graphical models for associations between variables, some of which are qualitative and some quantitative, Annals of Statistics, 17
Liu, 1998
Minsky, 1961, Steps toward artificial intelligence, Transactions on Institute of Radio Engineers, 49, 8
P.M. Murphy, D.W. Aha, UCI repository of machine learning databases, Technical Report, University of California at Irvine, 1995. Available from: <http://www.ics.uci.edu/~mlearn>.
Neapolitan, 2003
M. Pazzani, Searching for dependencies in Bayesian classifiers, in: Learning from Data: Artificial Intelligence and Statistics V, 1997, pp. 239–248.
Pearl, 1988
Pernkopf, 2005, Bayesian network classifier versus k-NN classifier, Pattern Recognition, 38, 1, 10.1016/j.patcog.2004.05.012
F. Pernkopf, J. Bilmes, Discriminative versus generative parameter and structure learning of Bayesian network classifiers, in: Proceedings of the 22nd International Conference on Machine Learning, 2005.
Quinlan, 1986, Induction of decision trees, Machine Learning, 1, 81, 10.1007/BF00116251
Quinlan, 1993
R. Raina, Y. Shen, A.Y. Ng, A. McCallum, Classification with hybrid generative/discriminative models, in: Advances in Neural Information Processing Systems 16, 2003.
Rosenblatt, 1959
M. Sahami, Learning limited dependence Bayesian classifiers, in: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, 1996, pp. 335–338.
G. Santafé, J.A. Lozano, P. Larrañaga, Discriminative learning of Bayesian network classifiers via the TM algorithm, in: Proceedings of the Eighth European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty, 2005, pp. 148–160.
van der Putten, 2004, A bias-variance analysis of a real world learning problem: the CoIL challenge 2000, Machine Learning, 57, 177, 10.1023/B:MACH.0000035476.95130.99
H. Wang, Towards a unified framework of relevance, PhD Thesis, Faculty of Informatics, University of Ulster, 1996.
Wang, 1999, Axiomatic approach to feature subset selection based on relevance, IEEE Transactions on Pattern Analysis and Machine Intelligence, 21, 271, 10.1109/34.754624
Witten, 2005
Y. Yang, G.I. Webb, Discretization for naive-Bayes learning: managing discretization bias and variance, Technical Report 2003-131, School of Computer Science and Software Engineering, Monash University, 2003.
Yu, 2004, Efficient feature selection via analysis of relevance and redundancy, Journal of Machine Learning Research, 5, 1205