Unsupervised Action Classification Using Space-Time Link Analysis
Abstract
We address the problem of unsupervised discovery of action classes in video data. In contrast to the methods proposed so far for this task, we present a space-time link analysis approach that consistently matches or exceeds the performance of traditional unsupervised action categorization methods on various datasets. Our method is inspired by the recent success of link analysis techniques in the image domain. Applying these techniques in the space-time domain lets us account naturally for the spatiotemporal relationships between video features while leveraging the power of graph matching for action classification. We present a comprehensive set of experiments demonstrating that our approach handles cluttered backgrounds, activities with subtle movements, and video data from moving cameras. We report state-of-the-art results on standard datasets. We also demonstrate our method in a surveillance application aimed at preventing fraud in retail stores.
