A review of the applications of structured sparse representation in image annotation

Artificial Intelligence Review - Volume 48 - Pages 331-348 - 2016
Vafa Maihami1, Farzin Yaghmaee1
1Department of Electrical and Computer Engineering, Semnan University, Semnan, Iran

Abstract

The number of images on the Web and in other information environments keeps growing, and these images need efficient management and suitable retrieval, especially by computers. Image annotation is the process of producing words for a digital image based on its content. Users prefer to search for images with text queries and keywords, which has increased the use of image annotation. In this paper, we discuss the applicability of structured sparse representations to image annotation. First, the components of image annotation and of sparse representation are reviewed. Then, we survey image annotation algorithms based on structured sparse representation. Next, a comparison of these algorithms is presented. Finally, the paper concludes with some major challenges and open issues in image annotation using structured sparse representations.

Keywords

#image annotation #structured sparse representation #image retrieval algorithms #image management #challenges in image annotation
