Multiclass classification with potential function rules: Margin distribution and generalization
References
Aizerman, 1964, Theoretical foundations of the potential function method in pattern recognition learning, Automation and Remote Control, 25, 917
Aizerman, 1964, The probability problem of pattern recognition learning and the method of potential functions, Automation and Remote Control, 25, 1307
Aizerman, 1964, The method of potential functions for the problem of restoring the characteristic of a function converter from randomly observed points, Automation and Remote Control, 25, 1705
Aizerman, 1970, Extrapolative problems in automatic control and the method of potential functions, American Mathematical Society Translations, 87, 281, 10.1090/trans2/087/16
Anthony, 1992
Avi-Itzhak, 1996, Arbitrarily tight upper and lower bounds on the Bayesian probability of error, IEEE Transactions on Pattern Analysis and Machine Intelligence, 18, 89, 10.1109/34.476017
Barnard, 2003, Matching words and pictures, Journal of Machine Learning Research, 3, 1107
Barron, 1991, Complexity regularization with application to artificial neural networks, in: Nonparametric Functional Estimation and Related Topics, Kluwer Academic Publishers, 561–576
Bartlett, 1997, For valid generalization, the size of the weights is more important than the size of the network, Advances in Neural Information Processing Systems, 9, 134
Bashkirov, 1964, Potential function algorithms for pattern recognition learning machines, Automation and Remote Control, 25, 692
Bayes, 1763, An essay towards solving a problem in the doctrine of chances, The Philosophical Transactions, 53, 370
Ben-Bassat, 1980, Sensitivity analysis in Bayesian classification models: multiplicative deviations, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2, 261, 10.1109/TPAMI.1980.4767015
Berger, 1985
Bishop, 2006
Boulle, 2007, Compression-based averaging of selective naive Bayes classifiers, Journal of Machine Learning Research, 8, 1659
Bousquet, 2002, Stability and generalization, Journal of Machine Learning Research, 2, 499
Braverman, 1965, On the method of potential functions, Automation and Remote Control, 26, 2205
Braverman, 1966, Estimation of the rate of convergence of algorithms based on the potential functions method, Automation and Remote Control, 27, 95
Bruneau, 2010, Parsimonious reduction of Gaussian mixture models with a variational-Bayes approach, Pattern Recognition, 43, 850, 10.1016/j.patcog.2009.08.006
Chen, 2003, Support vector learning for fuzzy rule-based classification systems, IEEE Transactions on Fuzzy Systems, 11, 716, 10.1109/TFUZZ.2003.819843
Chen, 2006, MILES: multiple-instance learning via embedded instance selection, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 1931, 10.1109/TPAMI.2006.248
Chen, 2009, Similarity-based classification: concepts and algorithms, Journal of Machine Learning Research, 10, 747
Davis, 1975, Mean square error properties of density estimates, The Annals of Statistics, 3, 1025, 10.1214/aos/1176343207
Davis, 1977, Mean integrated square error properties of density estimates, The Annals of Statistics, 5, 530, 10.1214/aos/1176343850
Devijver, 1974, On a new class of bounds on Bayes risk in multi-hypothesis pattern recognition, IEEE Transactions on Computers, 23, 70, 10.1109/T-C.1974.223779
Devroye, 1981, On the asymptotic probability of error in nonparametric discrimination, The Annals of Statistics, 9, 1320, 10.1214/aos/1176345648
Devroye, 1983, The equivalence of weak, strong and complete convergence in L1 for kernel density estimates, The Annals of Statistics, 11, 896
Devroye, 1988, Asymptotic performance bounds for the kernel estimate, The Annals of Statistics, 16, 1162, 10.1214/aos/1176350953
Devroye, 1989, A universal lower bound for the kernel estimate, Statistics and Probability Letters, 8, 419, 10.1016/0167-7152(89)90021-7
Devroye, 1996
Devroye, 1998, The Hilbert kernel regression estimate, Journal of Multivariate Analysis, 65, 209, 10.1006/jmva.1997.1725
Devroye, 1999, On the Hilbert kernel density estimate, Statistics and Probability Letters, 44, 299, 10.1016/S0167-7152(99)00021-8
Devroye, 2002, New multivariate product density estimator, Journal of Multivariate Analysis, 82, 88, 10.1006/jmva.2001.2021
Domingos, 1997, On the optimality of the simple Bayesian classifier under zero–one loss, Machine Learning, 29, 103, 10.1023/A:1007413511361
Duda, 2001
Garg, 2003, Margin distribution and learning algorithms, 210
Gordon, 1978, Asymptotically efficient solutions to the classification problem, The Annals of Statistics, 6, 515, 10.1214/aos/1176344197
Griffiths, 1998
Guermeur, 2007, VC theory of large margin multi-category classifiers, Journal of Machine Learning Research, 8, 2551
Halevy, 2009, The unreasonable effectiveness of data, IEEE Intelligent Systems, 24, 8, 10.1109/MIS.2009.36
Hashlamoun, 1994, A tight upper bound on the Bayesian probability of error, IEEE Transactions on Pattern Analysis and Machine Intelligence, 16, 220, 10.1109/34.273728
Hastie, 2001
Hofmann, 1999, Unsupervised learning from dyadic data, Advances in Neural Information Processing Systems, 11, 466
Jin, 2010, Regularized margin-based conditional log-likelihood loss for prototype learning, Pattern Recognition, 43, 428, 10.1016/j.patcog.2010.01.013
Kearns, 1994
Kim, 2006, Bayesian Gaussian process classification with the EM–EP algorithm, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 1948, 10.1109/TPAMI.2006.238
Kim, 2010, Large margin cost-sensitive learning of conditional random fields, Pattern Recognition, 43, 3683, 10.1016/j.patcog.2010.05.013
Langford, 2002, PAC-Bayes and margins, Advances in Neural Information Processing Systems, 15, 439
Langley, 1992, An analysis of Bayesian classifiers, 223
Langseth, 2009, Latent classification models for binary data, Pattern Recognition, 42, 2724, 10.1016/j.patcog.2009.05.002
Lugosi, 1996, Concept learning using complexity regularization, IEEE Transactions on Information Theory, 42, 48, 10.1109/18.481777
Maurer, 2008, Learning similarity with operator-valued large-margin classifiers, Journal of Machine Learning Research, 9, 1049
Mitchell, 1997
Pekalska, 2001, A generalized kernel approach to dissimilarity-based classification, Journal of Machine Learning Research, 2, 175
Rätsch, 2005, Efficient margin maximizing with boosting, Journal of Machine Learning Research, 6, 2131
Rosset, 2004, Boosting as a regularized path to a maximum margin classifier, Journal of Machine Learning Research, 5, 941
Schapire, 1998, Boosting the margin: a new explanation for the effectiveness of voting methods, The Annals of Statistics, 26, 1651
Schölkopf, 2002
Shawe-Taylor, 2004
Stone, 1977, Consistent nonparametric regression, The Annals of Statistics, 5, 595, 10.1214/aos/1176343886
Sung, 2008, Latent-space variational Bayes, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 2236, 10.1109/TPAMI.2008.157
Tibshirani, 2007, Margin trees for high-dimensional classification, Journal of Machine Learning Research, 8, 637
Vapnik, 1971, On the uniform convergence of relative frequencies of events to their probabilities, Theory of Probability and Its Applications, 16, 264, 10.1137/1116025
Vapnik, 1982
Vapnik, 1998
Veeramachaneni, 2007, Analytical results on style-constrained Bayesian classification of pattern fields, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 1280, 10.1109/TPAMI.2007.1030
Wang, 2007, Large margin semi-supervised learning, Journal of Machine Learning Research, 8, 1867
Watson, 1963, On the estimation of the probability density I, The Annals of Mathematical Statistics, 34, 480, 10.1214/aoms/1177704159
Weinberger, 2009, Distance metric learning for large margin nearest neighbor classification, Journal of Machine Learning Research, 10, 207