Deep Semisupervised Zero-Shot Learning with Maximum Mean Discrepancy

Neural Computation - Volume 30, Issue 5 - Pages 1426-1447 - 2018
Lingling Zhang (1), Jun Liu (1), Minnan Luo (1), Xiaojun Chang (2), Qinghua Zheng (1)
(1) MOEKLINNS Lab, Department of Computer Science and Technology, Xi'an Jiaotong University, 710049, China
(2) Centre for Quantum Computation and Intelligent Systems, University of Technology Sydney, Ultimo, NSW 2007, Australia

Abstract

Because labeled images are difficult to collect for hundreds of thousands of visual categories, zero-shot learning, in which unseen categories have no labeled images during the training stage, has attracted increasing attention. Many previous studies transfer knowledge from seen to unseen categories by projecting all category labels into a semantic space. However, label embeddings cannot adequately express the semantics of categories. Furthermore, the common semantics of seen and unseen instances cannot be captured accurately because the distributions of these instances may differ considerably. To address these issues, we propose a novel deep semisupervised method that jointly considers the heterogeneity gap between different modalities and the correlation among unimodal instances. The method replaces the original labels with the corresponding textual descriptions to better capture category semantics. It also overcomes the distribution difference by minimizing the maximum mean discrepancy between the seen and unseen instance distributions. Extensive experimental results on two benchmark data sets, CU200-Birds and Oxford Flowers-102, indicate that our method achieves significant improvements over previous methods.
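To make the distribution-matching idea concrete, the following is a minimal sketch of a maximum mean discrepancy (MMD) estimate between two sets of feature vectors. It is not the authors' implementation: the Gaussian kernel, the bandwidth value, the biased estimator, and the names `gaussian_kernel`, `mmd_squared`, `seen_feats`, `same_dist`, and `shifted` are all assumptions chosen for illustration, and the synthetic features stand in for the learned seen and unseen instance representations.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances, shape (len(a), len(b)).
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd_squared(seen, unseen, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples.

    Assumption: a single Gaussian kernel with a fixed bandwidth.
    """
    k_ss = gaussian_kernel(seen, seen, bandwidth).mean()
    k_uu = gaussian_kernel(unseen, unseen, bandwidth).mean()
    k_su = gaussian_kernel(seen, unseen, bandwidth).mean()
    return k_ss + k_uu - 2.0 * k_su


# Synthetic stand-ins for seen and unseen instance features (hypothetical data).
rng = np.random.default_rng(0)
seen_feats = rng.normal(0.0, 1.0, size=(200, 8))
same_dist = rng.normal(0.0, 1.0, size=(200, 8))   # same distribution as seen
shifted = rng.normal(1.5, 1.0, size=(200, 8))     # mean-shifted distribution

mmd_same = mmd_squared(seen_feats, same_dist, bandwidth=2.0)
mmd_shifted = mmd_squared(seen_feats, shifted, bandwidth=2.0)
print(f"MMD^2 (same distribution):    {mmd_same:.4f}")
print(f"MMD^2 (shifted distribution): {mmd_shifted:.4f}")
```

The estimate is small when the two samples come from the same distribution and grows as they drift apart, which is why a term of this kind can serve as a training penalty that pulls the seen and unseen feature distributions together.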
