An algebraic characterization of the optimum of regularized kernel methods
References
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csáki (Eds.), Second international symposium on information theory. Budapest: Akadémiai Kiadó.
Chapelle, O. (2007). Training a support vector machine in the primal. Neural Computation, 19(5), 1155–1178.
Chu, W., Keerthi, S. S., & Ong, C. J. (2001). A unified loss function in Bayesian framework for support vector regression. In Proceedings of the 18th international conference on machine learning (pp. 51–58). San Francisco: Morgan Kaufmann.
Cox, D., & O’Sullivan, F. (1990). Asymptotic analysis of penalized likelihood and related estimators. The Annals of Statistics, 18, 1676–1695.
Craven, P., & Wahba, G. (1979). Smoothing noisy data with spline functions. Numerische Mathematik, 31, 377–403.
De Vito, E., Rosasco, L., Caponnetto, A., Piana, M., & Verri, A. (2004). Some properties of regularized kernel methods. Journal of Machine Learning Research, 5, 1363–1390.
Dekel, O., Shalev-Shwartz, S., & Singer, Y. (2005). Smooth ε-insensitive regression by loss symmetrization. Journal of Machine Learning Research, 6, 711–741.
Dinuzzo, F., Neve, M., De Nicolao, G., & Gianazza, U. P. (2007). On the representer theorem and equivalent degrees of freedom of SVR. Journal of Machine Learning Research, 8, 2467–2495.
Efron, B. (2004). The estimation of prediction error: covariance penalties and cross-validation. Journal of the American Statistical Association, 99(467), 619–632.
Fukushima, M., & Qi, L. (Eds.) (1999). Reformulation: nonsmooth, piecewise smooth, semismooth and smoothing methods. Dordrecht: Kluwer Academic.
Gunter, L., & Zhu, J. (2007). Efficient computation and model selection for the support vector regression. Neural Computation, 19, 1633–1655.
Hastie, T. J., Tibshirani, R. J., & Friedman, J. (2001). The elements of statistical learning. Data mining, inference and prediction. New York: Springer.
Kimeldorf, G., & Wahba, G. (1971). Some results on Tchebycheffian spline functions. Journal of Mathematical Analysis and Applications, 33(1), 82–95.
Lee, Y. J., & Mangasarian, O. L. (2001). SSVM: A smooth support vector machine for classification. Computational Optimization and Applications, 20, 5–22.
Mallows, C. (1973). Some comments on Cp. Technometrics, 15, 661–675.
Nesterov, Y. (2005). Smooth minimization of non-smooth functions. Mathematical Programming, 103, 127–152.
Osborne, M. R. (2001). Simplicial algorithms for minimizing polyhedral functions. Cambridge: Cambridge University Press.
Pérez-Cruz, F., Bousoño-Calzón, C., & Artés-Rodríguez, A. (2005). Convergence of the IRWLS procedure to the support vector machine solution. Neural Computation, 17(1), 7–18.
Platt, J. (1998). Fast training of support vector machines using sequential minimal optimization. In B. Schölkopf, C. Burges, & A. Smola (Eds.), Advances in kernel methods—support vector learning. Cambridge: MIT Press.
Poggio, T., & Girosi, F. (1992). A theory of networks for approximation and learning. In Foundations of neural networks (pp. 91–106).
Pontil, M., & Verri, A. (1998). Properties of support vector machines. Neural Computation, 10, 955–974.
Schölkopf, B., Herbrich, R., & Smola, A. J. (2001). A generalized representer theorem. In Computational learning theory (COLT 2001), Lecture Notes in Computer Science (Vol. 2111, pp. 416–426). Berlin: Springer.
Schölkopf, B., & Smola, A. J. (2001). Learning with kernels: support vector machines, regularization, optimization, and beyond. Adaptive computation and machine learning. Cambridge: MIT Press.
Shalev-Shwartz, S., Singer, Y., & Srebro, N. (2007). PEGASOS: primal estimated sub-gradient solver for SVM. In ICML ’07: proceedings of the 24th international conference on machine learning (pp. 807–814). New York: ACM.
Stein, C. (1981). Estimation of the mean of a multivariate normal distribution. The Annals of Statistics, 9, 1135–1151.
Steinwart, I. (2003). Sparseness of support vector machines. Journal of Machine Learning Research, 4, 1071–1105.
Wahba, G. (1998). Support vector machines, reproducing kernel Hilbert spaces and randomized GACV (Tech. Rep. 984). Department of Statistics, University of Wisconsin.
Ye, J. (1998). On measuring and correcting the effects of data mining and model selection. Journal of the American Statistical Association, 93, 120–131.