Image-level classification by hierarchical structure learning with visual and semantic similarities

Information Sciences - Tập 422 - Trang 271-281 - 2018
Chunjie Zhang1,2, Jian Cheng3,4,2, Qi Tian5
1Research Center for Brain-inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, 100190 Beijing, China
2University of Chinese Academy of Sciences, 100049 Beijing, China
3National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 100190 Beijing, China
4Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, China
5Department of Computer Sciences, University of Texas at San Antonio TX, 78249, U.S.A

Tài liệu tham khảo

Bo, 2013, Multipath sparse coding using hierarchical matching pursuit, 660 Boiman, 2008, In defense of nearest-neighbor based image classification, 1 K. Chatfield, K. Simonyan, A. Vedaldi, A. Zisserman, return of the devil in the details: delving deep into convolutional nets, 2014. arXiv:1405.3531. Cinbis, 2012, “image categorization using fisher kernels of non-iid image models”, 2184 J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, T. Darrell, “deCAF: a deep convolutional activation feature for generic visual recognition”, 2013. [cs.CV]. arXiv:1310.1531. M. Everingham, L. Gool, C. Williams, J. Winn, A. Zisserman, The PASCAL visual object classes challenge 2012 (VOC 2012) results. http://www.pascal-network.org/challenges/VOC/voc2012/workshop/index.html. M. Everingham, A. Zisserman, C. Williams, L.V. Gool, The PASCAL visual object classes challenge 2007 (VOC 2007) results, technical report, pascal challenge, 2007. G. Griffin, A. Holub, P. Perona, Caltech-256 object category dataset, technical report, caltech, 2007. Gonzalez, 2016, “An investigation on the use of local multi-resolution patterns for image classification”, Inf. Sci., 361-362, 1, 10.1016/j.ins.2016.04.044 H. Harzallah, F. Jurie, C. Schmid, combining efficient object localization and image classification, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2009.237C244. Hong, 2013, Image-based 3d human pose recovery with locality sensitive sparse retrieval, 2103 Hong, 2015, Multimodal deep autoencoder for human pose recovery, IEEE Trans. Image Process., 24, 5659, 10.1109/TIP.2015.2487860 Hong, 2015, Multi-view ensemble manifold regularization for 3d object recognition, Inf. Sci., 320, 395, 10.1016/j.ins.2015.03.032 Krizhevsky, 2012, Imagenet classification with deep convolutional neural networks, 1097 Li, 2010, Objectbank: a high-level image representation for scene classification & semantic feature sparsification, 1378 Ng, 2002, On spectral clustering: analysis and an algorithm, 849 Oquab, 2014, Learning and transferring mid-level image representations using convolutional neural networks, 1717 Sanchez, 2013, Image classification with the fisher vector: Theory and practice, Int. J. Comput. Vision, 105, 222, 10.1007/s11263-013-0636-x P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, Y.L. Cun, Overfeat: Integrated recognition, localization and detection using convolutional networks, 2013. arXiv:1312.6229. K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014. [cs.CV]. arXiv:1409.1556. Sohn, 2011, Efficient learning of sparse, distributed, convolutional feature representations for object recognition, 2643 Tan, 2016, Weakly supervised metric learning for traffic sign recognition in a LIDAR-equipped vehicle, IEEE Trans. Intell. Transp. Syst., 17, 1415, 10.1109/TITS.2015.2506182 Torresani, 2010, Efficient object category recognition using classesmes, 776 Wang, 2010, Locality-constrained linear coding for image classification, 3360 Wang, 2014, Detecting human action as the spatio-temporal tube of maximum mutual information, IEEE Trans. Circuits Syst. Video Technol., 24, 277, 10.1109/TCSVT.2013.2276856 Wei, 2016, HCP: A flexible CNN framework for multi-label image classification, IEEE Trans. Pattern Anal. Mach. Intell., 38, 1901, 10.1109/TPAMI.2015.2491929 Xu, 2016, Multi-loss regularized deep neural network, IEEE Trans. Circuits Syst. Video Technol., 26, 2273, 10.1109/TCSVT.2015.2477937 Yang, 2016, Exploit bounding box annotations for multi-label object recognition, 280 Yang, 2009, Linear spatial pyramid matching using sparse coding for image classification, 1794 Yu, 2014, Click prediction for web image reranking using multimodal sparse coding, IEEE Trans. Image Process., 23, 2019, 10.1109/TIP.2014.2311377 Yu, 2014, Semantic preserving distance metric learning and applications, Inf. Sci., 281, 674, 10.1016/j.ins.2014.01.025 Yu, 2016, Deep multimodal distance metric learning using click constraints for image ranking, IEEE Trans. Cybern. Yuan, 2016, High-order local ternary patterns with locality preserving projection for smoke detection and image classification, Inf. Sci., 372, 225, 10.1016/j.ins.2016.08.040 Zeiler, 2013, Visualizing and understanding convolutional networks, CoRR Zepeda, 2015, Exemplar SVMs as visual feature encoders, 3052 Zhang, 2015, Beyond explicit codebook generation: visual representation using implicitly transferred codebooks, IEEE Trans. Image Process., 24, 5777, 10.1109/TIP.2015.2485783 Zhang, 2014, Object categorization in sub-semantic space, Neurocomputing, 142, 248, 10.1016/j.neucom.2014.03.059 Zhang, 2017, Incremental codebook adaptation for visual representation and categorization, IEEE Trans. Cybern. Zhang, 2017, Multi-view label sharing for visual representations and classifications, IEEE Trans. Multimed, Accepted. Zhang, 2017, Structured weak semantic space construction for visual categorization, IEEE Trans. Neural Network Learn. Syst. Zhang, 2015, Image classification using boosted local features with random orientation and location selection, Inf. Sci., 310, 118, 10.1016/j.ins.2015.03.011 Zhang, 2017, Contextual exemplar classifier based image representation for classification, IEEE Trans. Circuits Syst. Video Technol., 27, 1691, 10.1109/TCSVT.2016.2527380 Zhang, 2017, Fine-grained image classification via low-rank sparse coding with general and class-specific codebooks, IEEE Trans. Neural Network Learn. Syst., 28, 1550, 10.1109/TNNLS.2016.2545112 Zhang, 2014, Image classification by non-negative sparse coding, correlation constrained low-rank and sparse decomposition, Comput. Vision Image Understanding, 123, 14, 10.1016/j.cviu.2014.02.013 Zhang, 2011, Image classification by non-negative sparse coding, low-rank and sparse decomposition, 1673 Zhang, 2017, Bundled local features for image representation, IEEE Trans. Circuits Syst. Video Technol. Zhang, 2016, Boosted random contextual semantic space based representation for visual recognition, Inf. Sci., 369, 160, 10.1016/j.ins.2016.06.029 Zhang, 2017, Image classification by search with explicitly and implicitly semantic representations, Inf. Sci., 376, 125, 10.1016/j.ins.2016.10.019 Zhang, 2016, Image class prediction by joint object, context and background modeling, IEEE Trans. Circuits Syst. Video Technol. Zhang, 2015, Spatial-aware object-level saliency prediction by learning graphlet hierarchies, IEEE Trans. Indust. Electron., 62, 1301, 10.1109/TIE.2014.2336602