Squeezing the DCT to Fight Camouflage

Journal of Mathematical Imaging and Vision - Volume 62 - Pages 206-222 - 2019
Marcos Escudero-Viñolo (1), Jesus Bescos (1)
(1) Video Processing and Understanding Lab, Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid, Spain

Abstract

This paper presents a novel descriptor based on the two-dimensional discrete cosine transform (2D DCT) to fight camouflage. The 2D DCT gained popularity in image and video analysis owing to its wide use in signal compression. It is a well-established example for evaluating new techniques in sparse representation and is widely used for block and texture description, mainly because of its simplicity and its ability to condense information into a few coefficients. A common approach, across different applications, is to select a subset of these coefficients that is fixed for every analyzed signal. In this paper, we question this approach and propose a novel method to select a signal-dependent subset of relevant coefficients, which is the basis for the proposed R-DCT and sR-DCT descriptors. Because we propose to describe each pixel with a different set of coefficients, each associated with a particular basis function, comparing any two such descriptors requires a dedicated distance function; we propose a novel metric for this purpose. Experiments on the change detection dataset show that the proposed descriptors notably reduce the likelihood of camouflage with respect to other popular descriptors: by 92% with respect to pixel luminance, 82% with respect to RGB values, and 65% with respect to the best-performing LBP configuration.
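
The contrast between a fixed and a signal-dependent coefficient subset can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify how R-DCT and sR-DCT select coefficients or how their metric is defined. The sketch assumes a largest-magnitude selection rule and a Euclidean distance in which coefficients absent from a descriptor count as zero. The function names (`fixed_subset`, `adaptive_subset`, `descriptor_distance`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k is the k-th cosine basis vector."""
    k = np.arange(n).reshape(-1, 1)  # frequency index
    i = np.arange(n).reshape(1, -1)  # sample index
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] *= np.sqrt(1.0 / n)
    m[1:, :] *= np.sqrt(2.0 / n)
    return m


def dct2(block):
    """2D DCT of a block, computed as two separable matrix products."""
    rows = dct_matrix(block.shape[0])
    cols = dct_matrix(block.shape[1])
    return rows @ block @ cols.T


def fixed_subset(coeffs, k):
    """Keep the same k low-frequency positions for every block."""
    h, w = coeffs.shape
    order = sorted(((u, v) for u in range(h) for v in range(w)),
                   key=lambda p: (p[0] + p[1], p[0]))
    return {p: float(coeffs[p]) for p in order[:k]}


def adaptive_subset(coeffs, k):
    """Keep the k largest-magnitude coefficients of this block.

    Assumed selection rule for illustration; not taken from the paper.
    """
    flat = np.argsort(np.abs(coeffs), axis=None)[::-1][:k]
    idx = [tuple(int(x) for x in np.unravel_index(f, coeffs.shape))
           for f in flat]
    return {p: float(coeffs[p]) for p in idx}


def descriptor_distance(a, b):
    """Euclidean distance over the union of positions.

    Positions missing from one descriptor are treated as zero
    (an assumption, not the metric proposed in the paper).
    """
    keys = set(a) | set(b)
    total = sum((a.get(p, 0.0) - b.get(p, 0.0)) ** 2 for p in keys)
    return float(np.sqrt(total))


# Toy block whose energy lies entirely in one high-frequency basis function.
basis = dct_matrix(8)
block = np.outer(basis[5], basis[6])
coeffs = dct2(block)

fixed = fixed_subset(coeffs, 3)        # positions (0,0), (0,1), (1,0)
adaptive = adaptive_subset(coeffs, 3)  # includes position (5,6)
```

In this toy block the fixed low-frequency subset retains essentially none of the signal energy, while the signal-dependent subset keeps the dominant coefficient. Because the two descriptors then index different basis functions, they cannot be compared element by element, which is why a distance defined over differing coefficient sets is needed.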
