Deep Semisupervised Zero-Shot Learning with Maximum Mean Discrepancy
Abstract
Because labeled images are difficult to collect for hundreds of thousands of visual categories, zero-shot learning, in which unseen categories have no labeled images at the training stage, has attracted increasing attention. Many previous studies transfer knowledge from seen to unseen categories by projecting all category labels into a semantic space. However, label embeddings cannot adequately express the semantics of categories. Furthermore, the common semantics of seen and unseen instances cannot be captured accurately because the distributions of these instances may differ substantially. To address these issues, we propose a novel deep semisupervised method that jointly considers the heterogeneity gap between different modalities and the correlation among unimodal instances. The method replaces the original labels with the corresponding textual descriptions to better capture category semantics. It also addresses the distribution difference by minimizing the maximum mean discrepancy between the seen and unseen instance distributions. Extensive experiments on two benchmark data sets, CU200-Birds and Oxford Flowers-102, indicate that our method achieves significant improvements over previous methods.
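To make the maximum mean discrepancy (MMD) term concrete, the following is a minimal sketch of the standard biased empirical estimate of squared MMD between two samples. It is not the paper's implementation: the Gaussian kernel, the `gamma` value, the function names, and the toy feature vectors are all assumptions chosen for illustration.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(xs, ys, gamma=1.0):
    """Biased empirical estimate of squared MMD between samples xs and ys.

    MMD^2 = mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)
    """
    def mean_kernel(a, b):
        total = sum(rbf_kernel(p, q, gamma) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical 2-D features standing in for seen and unseen instances.
seen = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
unseen_near = [[0.1, 0.1], [0.0, 0.2], [0.2, 0.1]]
unseen_far = [[3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]

print(mmd_squared(seen, unseen_near))  # small: the two samples overlap
print(mmd_squared(seen, unseen_far))   # larger: the two samples are far apart
```

In a deep model, a quantity of this kind would typically be computed on learned features and added to the training loss, so that minimizing it pulls the seen and unseen feature distributions closer together.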