Classifier and Exemplar Synthesis for Zero-Shot Learning

Springer Science and Business Media LLC - 2019

Soravit Changpinyo¹, Wei-Lun Chao², Boqing Gong³, Fei Sha⁴

¹Google AI, Los Angeles, USA

²Department of Computer Science, Cornell University, Ithaca, USA

³Google, Seattle, USA

⁴Department of Computer Science, University of Southern California, Los Angeles, USA

Abstract

Zero-shot learning (ZSL) makes it possible to solve a task without seeing any examples of it. In this paper, we propose two ZSL frameworks that learn to synthesize parameters for novel unseen classes. First, we cast ZSL as the problem of learning manifold embeddings from graphs composed of object classes, which leads to a flexible approach that synthesizes “classifiers” for the unseen classes. Then, we define an auxiliary task of synthesizing “exemplars” for the unseen classes, which can be used as an automatic denoising mechanism for any existing ZSL approach or as an effective ZSL model by itself. On five visual recognition benchmark datasets, we demonstrate the superior performance of our proposed frameworks in various scenarios of both conventional and generalized ZSL. Finally, we provide insights through a series of empirical analyses, among which are a comparison of semantic representations on the full ImageNet benchmark and a comparison of metrics used in generalized ZSL. Our code and data are publicly available at https://github.com/pujols/Zero-shot-learning-journal.
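As a rough illustration of the exemplar idea described above, the following sketch predicts one visual exemplar per class from its semantic representation and labels a test feature by its nearest predicted exemplar. It is a minimal sketch under stated assumptions, not the authors' implementation: plain ridge regression stands in for whatever regressor the paper actually uses, the data are synthetic, and all names (`fit_exemplar_predictor`, `predict_labels`, `demo`) are hypothetical.

```python
import numpy as np


def fit_exemplar_predictor(seen_semantics, seen_exemplars, reg=1.0):
    """Fit a linear map from class semantic vectors to class exemplars.

    Assumption: ridge regression is used here only as a simple stand-in.
    seen_semantics: (num_seen_classes, semantic_dim)
    seen_exemplars: (num_seen_classes, feature_dim), e.g. per-class feature means
    """
    dim = seen_semantics.shape[1]
    gram = seen_semantics.T @ seen_semantics + reg * np.eye(dim)
    # Closed-form ridge solution: W = (A^T A + reg * I)^{-1} A^T V
    return np.linalg.solve(gram, seen_semantics.T @ seen_exemplars)


def predict_labels(features, unseen_semantics, weights):
    """Assign each feature to the unseen class with the nearest predicted exemplar."""
    exemplars = unseen_semantics @ weights  # (num_unseen_classes, feature_dim)
    diffs = features[:, None, :] - exemplars[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)  # (num_samples, num_unseen_classes)
    return sq_dists.argmin(axis=1)


def demo():
    """Synthetic check: exemplars are an exact linear function of semantics."""
    rng = np.random.default_rng(0)
    true_map = rng.normal(size=(5, 8))   # hypothetical ground-truth map
    seen_sem = rng.normal(size=(20, 5))  # 20 seen classes, 5-d semantics
    seen_ex = seen_sem @ true_map        # 8-d exemplars for the seen classes
    weights = fit_exemplar_predictor(seen_sem, seen_ex, reg=1e-6)

    unseen_sem = rng.normal(size=(3, 5))  # 3 unseen classes
    labels = rng.integers(0, 3, size=30)
    noise = 0.01 * rng.normal(size=(30, 8))
    features = (unseen_sem @ true_map)[labels] + noise
    preds = predict_labels(features, unseen_sem, weights)
    return preds, labels


if __name__ == "__main__":
    preds, labels = demo()
    print("accuracy:", (preds == labels).mean())
```

No training image of an unseen class is used at any point: only the semantic vectors of the unseen classes enter `predict_labels`, which is what makes the setting zero-shot.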

