Detection of artificial and scene text in images and video frames

Pattern Analysis and Applications - Volume 16 - Pages 431-446 - 2011

Marios Anthimopoulos¹, Basilis Gatos¹, Ioannis Pratikakis²

¹Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Center for Scientific Research “Demokritos”, Athens, Greece

²Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece

Abstract

Textual information in images and video frames constitutes a valuable source of high-level semantics for multimedia indexing and retrieval systems. Text detection is the most crucial step in a multimedia text extraction system, and although it has been studied extensively over the past decade, there is still no generic architecture that works for both artificial and scene text in multimedia content. In this paper we propose a system that detects both artificial and scene text in images and video frames. The system is based on a machine learning stage that uses a Random Forest classifier and a highly discriminative feature set produced by a new texture operator called Multilevel Adaptive Color edge Local Binary Pattern (MACeLBP). MACeLBP describes the spatial distribution of color edges at multiple adaptive levels of contrast. A gradient-based algorithm is then applied to separate individual text lines and to refine their localization. The whole algorithm is placed in a multiresolution framework to make text line detection invariant to scale. Finally, an optional connected-component step segments text lines into words based on the distances between the resulting components. The experimental results, obtained with a concise evaluation methodology, demonstrate the superior performance of the proposed text detection system for artificial and scene text in images and video frames.
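The final, optional step in the abstract (splitting a text line into words by the distances between components) can be illustrated with a minimal sketch. The abstract does not give the actual splitting rule, so everything below is an assumption: the function name `split_line_into_words`, the `(x_min, x_max)` box format, and the `gap_factor` parameter are hypothetical, and the "gap larger than a multiple of the median gap" criterion is a common heuristic swapped in for illustration, not necessarily the authors' method.

```python
from statistics import median


def split_line_into_words(boxes, gap_factor=2.0):
    """Group the components of one text line into words.

    boxes: list of (x_min, x_max) horizontal extents, one per component.
    gap_factor: hypothetical parameter; a gap wider than gap_factor times
    the median gap of the line is treated as a word boundary.
    """
    if not boxes:
        return []
    boxes = sorted(boxes)

    # Horizontal gap between each component and everything to its left.
    gaps = []
    right_edge = boxes[0][1]
    for x_min, x_max in boxes[1:]:
        gaps.append(max(0, x_min - right_edge))
        right_edge = max(right_edge, x_max)

    if not gaps:
        return [boxes]

    threshold = gap_factor * median(gaps)

    # Start a new word whenever the gap exceeds the threshold.
    words = [[boxes[0]]]
    for box, gap in zip(boxes[1:], gaps):
        if gap > threshold:
            words.append([box])
        else:
            words[-1].append(box)
    return words
```

For example, five components with gaps of 2, 2, 20 and 2 pixels have a median gap of 2, so with the default `gap_factor` only the 20-pixel gap is treated as a word boundary. A relative threshold of this kind adapts to font size, which fits the scale-invariance goal stated in the abstract, but the right value of `gap_factor` would have to be tuned on real data.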

