Semi-supervised classification with privileged information

Zhiquan Qi1, Yingjie Tian1, Lingfeng Niu1, Bo Wang1
1Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, China

Abstract

Privileged information, which is available only for training examples and not for test examples, is a new concept proposed by Vapnik and Vashist (Neural Netw 22(5–6):544–557, 2009). With the help of privileged information, learning using privileged information (LUPI) (Neural Netw 22(5–6):544–557, 2009) can significantly accelerate learning. However, LUPI is a standard supervised learning method, whereas many real-world problems also come with a large amount of unlabeled data. This motivates us to address such problems within a semi-supervised learning framework. In this paper, we propose semi-supervised learning using privileged information (Semi-LUPI), which exploits both the distribution information in unlabeled data and the privileged information to improve learning efficiency. Furthermore, we compare the relative importance of the two types of information for the learning model. The experiments verify the effectiveness of the proposed method and show that Semi-LUPI achieves better performance than traditional supervised and semi-supervised methods.

References

Vapnik V, Vashist A (2009) A new learning paradigm: learning using privileged information. Neural Netw 22(5–6):544–557
Vapnik V (1995) The nature of statistical learning theory. Springer, New York
Vapnik V (1996) The nature of statistical learning theory. Springer, New York
Vapnik V (2006) Estimation of dependences based on empirical data (information science and statistics). Springer, Berlin
Pechyony D, Vapnik V (2010) On the theory of learning with privileged information. In: Advances in neural information processing systems, vol 23
Pechyony D, Izmailov R, Vashist A, Vapnik V (2010) SMO-style algorithms for learning using privileged information. In: DMIN. CSREA Press, Providence, pp 235–241
Seeger M (2001) Learning with labeled and unlabeled data. Technical report
Chapelle O, Schölkopf B, Zien A (eds) (2006) Semi-supervised learning (adaptive computation and machine learning). The MIT Press, Cambridge
Zhu X (2006) Semi-supervised learning literature survey. Technical Report 15304, University of Wisconsin, Madison
Belkin M, Matveeva I, Niyogi P (2004) Regularization and semi-supervised learning on large graphs. In: COLT. Springer, Berlin, pp 624–638
Grandvalet Y, Bengio Y (2005) Semi-supervised learning by entropy minimization. In: CAP, PUG, pp 281–296
Belkin M, Niyogi P, Sindhwani V (2006) Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J Mach Learn Res 7:2399–2434
Joachims T (2003) Transductive learning via spectral graph partitioning. In: ICML, pp 290–297
Belkin M, Niyogi P (2002) Using manifold structure for partially labelled classification. In: NIPS, pp 953–960
Zhu X, Ghahramani Z, Lafferty J (2003) Semi-supervised learning using Gaussian fields and harmonic functions. In: ICML, pp 912–919
Deng N, Tian Y, Zhang C (2011) Optimization based data mining: theory and applications. Springer Press, Berlin
Tian Y, Yong S, Xiaohui L (2012) Recent advances on support vector machines research. Technol Econ Dev Econ 18(1):5–33
Mackey MC, Glass L (1977) Oscillation and chaos in physiological control systems. Science 197(4300):287–289
Everingham M, Zisserman A, Williams CKI, Van Gool L (2006) The PASCAL visual object classes challenge 2006 (VOC 2006) results. http://www.pascal-network.org/challenges/VOC/voc2006/results.pdf
Deng Y, Manjunath BS, Kenney C, Moore MS, Member S, Shin H (2001) An efficient color representation for image retrieval. IEEE Trans Image Process 10:140–147