Random forest for intermediate descriptor fusion in shot boundary detection
Abstract
Shot boundary detection is a fundamental component of many real-world applications, such as video retrieval. This paper addresses the problem of segmenting complex movie videos into shots. First, an intermediate descriptor is proposed to capture the variation at both abrupt and gradual shot boundaries; it is formed from a vector of distances computed on Local Binary Pattern (LBP) features, GIST features, or their fusion. Instead of using only the distance between adjacent frames, the intermediate descriptor keeps the distances between the current frame and several consecutive frames. It therefore characterizes the local temporal structure more comprehensively, which is especially important for gradual changes. Because random forests are well suited to feature fusion, a random forest is adopted here to evaluate the fusion of the intermediate descriptors on LBP and GIST. The experiments are conducted on a subset of the TRECVid 2013 INS (INstance Search) task to verify the effectiveness of the proposed intermediate descriptor and the fusion ability of the random forest. Compared with static and adaptive threshold approaches, the best performance is achieved by post-fusion of the intermediate descriptors on LBP and GIST.
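To make the idea of the intermediate descriptor concrete, the following is a minimal sketch rather than the paper's implementation. It assumes that per-frame features (for example LBP histograms or GIST vectors) are already available as lists of floats. The function name `intermediate_descriptor`, the Euclidean distance, the window length of 3, and the clamping at the end of the video are all illustrative assumptions; the abstract specifies none of them.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def intermediate_descriptor(features, t, window=3, dist=euclidean):
    """Distances from frame t to each of the next `window` frames.

    `features` is a list of per-frame feature vectors. Indices past the
    end of the video are clamped to the last frame (an assumption made
    only to keep the descriptor length fixed).
    """
    last = len(features) - 1
    return [
        dist(features[t], features[min(t + k, last)])
        for k in range(1, window + 1)
    ]


# Toy one-dimensional "features": an abrupt change after frame 2,
# followed by a slow ramp that stands in for a gradual transition.
toy = [[0.0], [0.0], [0.0], [4.0], [4.0], [4.5], [5.0], [5.5]]

abrupt = intermediate_descriptor(toy, 2)   # large jump at the first offset
gradual = intermediate_descriptor(toy, 4)  # distances grow with the offset
```

In this toy data the descriptor at frame 2 jumps immediately, while the descriptor at frame 4 grows steadily with the frame offset. An adjacent-frame distance alone would only see the small first step of the ramp. Vectors of this kind, computed separately on LBP and GIST features, are what a classifier such as a random forest could then take as input.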