Universal consistency of twin support vector machines

International Journal of Machine Learning and Cybernetics - Volume 12, Issue 7 - Pages 1867-1877 - 2021
Weiya Xu1, Daren Huang2, Shuigeng Zhou3
1School of Information Management, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China
2School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
3School of Computer Science, and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai 200433, China

Abstract

A classification problem aims at constructing the best classifier, i.e., the one with the smallest risk. As the sample size tends to infinity, learning algorithms for classification are characterized by an asymptotic property known as universal consistency, which plays a crucial role in assessing how classification rules are constructed. A universally consistent algorithm guarantees that, the larger the sample size, the more accurately the underlying distribution of the samples can be recovered, so that the risk of the learned classifier approaches the smallest achievable (Bayes) risk. Support vector machines (SVMs) are among the most important models for binary classification. How to effectively extend SVMs to twin support vector machines (TWSVMs) so as to improve classification performance has recently attracted increasing interest in many research areas, and many TWSVM variants have been proposed and used in practice. In this paper, we therefore focus on the universal consistency of TWSVMs in the binary classification setting. We first give a general framework for TWSVM classifiers that unifies most TWSVM variants for binary classification. Based on this framework, we then investigate the universal consistency of TWSVMs. To this end, we define risk, Bayes risk and universal consistency for TWSVMs. Our theoretical results show that universal consistency holds for various TWSVM classifiers under certain conditions, formulated in terms of covering numbers, localized covering numbers and stability. As applications of the general framework, several TWSVM variants are examined.
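
For orientation, the following LaTeX sketch records the standard definitions of risk, Bayes risk and universal consistency in binary classification, together with the decision rule of the original linear TWSVM (Khemchandani and Chandra 2007); the paper adapts these notions to its general TWSVM framework, so its exact formulations may differ.

  % Binary classification: a pair (X, Y) with distribution P on \mathcal{X} \times \{-1,+1\}.
  % Risk of a classifier f and the Bayes risk:
  \[
    R(f) = P\bigl(f(X) \neq Y\bigr), \qquad
    R^{*} = \inf_{f:\,\mathcal{X}\to\{-1,+1\}} R(f).
  \]
  % A learning rule that outputs f_n from n i.i.d. samples is universally
  % consistent if, for every distribution P,
  \[
    R(f_n) \;\longrightarrow\; R^{*} \quad \text{in probability as } n \to \infty.
  \]
  % The linear TWSVM builds two nonparallel hyperplanes
  % x^{\top} w_1 + b_1 = 0 and x^{\top} w_2 + b_2 = 0, each fitted to lie
  % close to its own class and far from the other; a new point is assigned
  % to the class of the nearer hyperplane:
  \[
    f(x) = \operatorname*{arg\,min}_{k \in \{1,2\}}
           \frac{\lvert x^{\top} w_k + b_k \rvert}{\lVert w_k \rVert}.
  \]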

Keywords

