Multivariate time series modeling of geometric features of spatio-temporal volumes for content based video retrieval

Chiranjoy Chattopadhyay¹, Amit Kumar Maurya¹
¹ Indian Institute of Technology Madras, Chennai, India

Abstract

In this paper, we address the problem of Content Based Video Retrieval using multivariate time series modeling of features. We focus on representing the dynamics of geometric features on the Spatio-Temporal Volume (STV) created from a real-world video shot. The STV intrinsically holds the video content by capturing how the appearance of the foreground object changes over time, and hence can be considered a dynamical system. We capture the geometric property of the parameterized STV using the Gaussian curvature computed at each point on its surface. The change of Gaussian curvature over time is then modeled as a Linear Dynamical System (LDS). Because it can efficiently model the dynamics of a multivariate signal, the Auto Regressive Moving Average (ARMA) model is used to represent the time series data. The parameters of the ARMA model are then used for video content representation. To discriminate between a pair of video shots (time series), we use the subspace angle between a pair of feature vectors formed from the ARMA model parameters. Experiments are performed on four publicly available benchmark datasets, shot using a static camera. We present both qualitative and quantitative analysis of our proposed framework. Comparative results with three recent works on video retrieval also demonstrate the efficiency of the proposed framework.
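The pipeline described above (a curvature time series, an LDS/ARMA fit, then a subspace-angle comparison) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: each shot is already reduced to a `p x T` matrix whose columns hold the per-frame curvature features; the model is fitted with the common SVD-based closed-form estimate; and subspace angles are approximated from a finite observability matrix. The function names `fit_lds`, `observability`, `subspace_angles`, and `lds_distance`, and the defaults `order=3` and `steps=5`, are hypothetical choices made for illustration.

```python
import numpy as np

def fit_lds(Y, order):
    """Estimate (A, C) of x[t+1] = A x[t], y[t] = C x[t] from Y of shape (p, T).

    Assumption: SVD-based closed-form estimate, not necessarily the
    estimator used in the paper.
    """
    Y = Y - Y.mean(axis=1, keepdims=True)          # remove the temporal mean
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :order]                               # observation matrix (p, order)
    X = np.diag(s[:order]) @ Vt[:order, :]         # hidden states (order, T)
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])       # least-squares state transition
    return A, C

def observability(A, C, steps):
    """Stack C, CA, CA^2, ... into a finite observability matrix."""
    blocks = [C]
    for _ in range(steps - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

def subspace_angles(O1, O2):
    """Principal angles between the column spaces of O1 and O2."""
    Q1, _ = np.linalg.qr(O1)
    Q2, _ = np.linalg.qr(O2)
    cosines = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

def lds_distance(Y1, Y2, order=3, steps=5):
    """Distance between two shots from the angles of their fitted models."""
    O1 = observability(*fit_lds(Y1, order), steps)
    O2 = observability(*fit_lds(Y2, order), steps)
    theta = subspace_angles(O1, O2)
    return float(np.sqrt(np.sum(np.sin(theta) ** 2)))

# Synthetic stand-ins for two curvature time series (10 features, 40 frames).
rng = np.random.default_rng(0)
shot_a = rng.standard_normal((10, 40))
shot_b = rng.standard_normal((10, 40))
d_self = lds_distance(shot_a, shot_a)
d_other = lds_distance(shot_a, shot_b)
```

In this sketch a shot compared with itself yields a distance close to zero, while two unrelated series yield a clearly larger value; retrieval would then rank database shots by increasing distance to the query. The truncation length `steps` and the model `order` are tuning choices assumed here, not values taken from the paper.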
