Human activity recognition from 3D data: A review

Pattern Recognition Letters - Volume 48 - Pages 70-80 - 2014
J. K. Aggarwal1, Lu Xia2
1The University of Texas at Austin, Austin, TX 78705, USA
2University of Texas at Austin

Abstract

Keywords


References

Aggarwal, 2011, Human activity analysis: a review, ACM Comput. Surv. (CSUR), 43, 16, 10.1145/1922649.1922653

M. Ahmad, S.W. Lee, Hmm-based human action recognition using multiview image sequences in: ICPR, (2006) 263–266.

Argyriou, 2010, Photometric stereo with an arbitrary number of illuminants, CVIU, 114, 887

Arman, 1993, Model-based object recognition in dense-range images - a review, ACM Comput. Surv. (CSUR), 25, 5, 10.1145/151254.151255

G. Ballin, M. Munaro, E. Menegatti, Human action recognition from rgb-d frames based on real-time 3d optical flow estimation, in: Biologically Inspired Cognitive Architectures. Springer, 2013, pp. 65–74.

Barron, 1994, Performance of optical flow techniques, IJCV, 12, 43, 10.1007/BF01420984

Ben-Arie, 2002, Human activity recognition using multidimensional indexing, Pattern Anal. Mach. Intell. IEEE Trans., 24, 1091, 10.1109/TPAMI.2002.1023805

Blinov, 2005, Reconstruction of 3-d horizons from 3-d seismic datasets, Geosci. Remote Sens., 43, 1421, 10.1109/TGRS.2005.844731

V. Bloom, D. Makris, V. Argyriou, G3d: a gaming action dataset and real time action recognition evaluation framework, in: CVPRW, 2012, pp. 7–12.

J. Cech, J. Sanchez-Riera, R. Horaud, Scene flow estimation by growing correspondence seeds, in: CVPR, IEEE. 2011, pp. 3129–3136.

R. Chaudhry, A. Ravichandran, G. Hager, R. Vidal, Histograms of oriented optical flow and binet-cauchy kernels on nonlinear dynamical systems for the recognition of human actions, in: CVPR, IEEE, 2009, pp. 1932–1939.

Chen, 2013, A survey of human motion analysis using depth imagery, Pattern Recognit. Lett., 10.1016/j.patrec.2013.02.006

Chen, 2012, Occluded human action analysis using dynamic manifold model, ICPR, IEEE, 1245

M.Y. Chen, A. Hauptmann, Mosift: Recognizing human actions in surveillance videos, 2009.

Z. Cheng, L. Qin, Y. Ye, Q. Huang, Q. Tian, Human daily action analysis with multi-view and color-depth data, in: ECCV Workshops and Demonstrations, Springer, 2012, pp. 52–61.

Chiuso, 2002, Structure from motion causally integrated over time, Pattern Anal. Mach. Intell. IEEE Trans., 24, 523, 10.1109/34.993559

Christmas, 1995, Structural matching in computer vision using probabilistic relaxation, PAMI, 17, 749, 10.1109/34.400565

Cohen, 2003, Inference of human postures by classification of 3d human body shape, AMFG Workshop, IEEE, 74

N. Dalal, B. Triggs, C. Schmid, Human detection using oriented histograms of flow and appearance, in: ECCV. Springer, 2006, pp. 428–441.

Darrell, 2000, Integrated person tracking using stereo, color, and pattern detection, IJCV, 37, 175, 10.1023/A:1008103604354

J.W. Davis, A.F. Bobick, The representation and recognition of human movement using temporal templates, in: CVPR, IEEE, 1997, pp. 928–934.

L. Deng, H. Leung, N. Gu, Y. Yang, Generalized model-based human motion recognition with body partition index maps, in: Computer Graphics Forum, Wiley Online Library, 2012, pp. 202–215.

Dhond, 1989, Structure from stereo-a review, Syst. Man Cybern. IEEE Trans., 19, 1489, 10.1109/21.44067

Dollár, 2005, Behavior recognition via sparse spatio-temporal features, PETS, 65

M. Domínguez-Morales, A. Jiménez-Fernández, R. Paz-Vicente, A. Linares-Barranco, G. Jiménez-Moreno, 2012. Stereo matching: From the basis to neuromorphic engineering.

Fanello, 2013, Keep it simple and sparse: real-time action recognition, J. Mach. Learn. Res., 14, 2617

P. Favaro, S. Soatto, Learning shape from defocus, in: ECCV. Springer, 2002, pp. 735–745.

H. Fujiyoshi, A.J. Lipton, Real-time human motion analysis by image skeletonization, in: Applications of Computer Vision, 1998. WACV’98. Proceedings., Fourth IEEE Workshop on, IEEE, 1998, pp. 15–21.

D. Gehrig, T. Schultz, Selecting relevant features for human motion recognition, in: ICPR, IEEE, 2008.

Georgis, 1998, Error guided design of a 3d vision system, PAMI, 20, 366, 10.1109/34.677262

J.J. Gibson, The perception of the visual world, 1950.

Grzeszczuk, 2000, Stereo based gesture recognition invariant to 3d pose and lighting, CVPR, 826

M. Harville, D. Li, Fast, integrated person tracking and activity recognition with plan-view templates from a single stereo camera, in: CVPR, 2004.

A. Hernández-Vela, M. Bautista, X. Perez-Sala, V. Ponce, X. Baró, O. Pujol, C. Angulo, S. Escalera, Bovdw: Bag-of-visual-and-depth-words for gesture recognition, in: ICPR, 2012.

C. Hesher, A. Srivastava, G. Erlebacher, A novel technique for face recognition using range imaging, in: Signal processing and its applications, Seventh international symposium on, IEEE, 2003, pp. 201–204.

Holte, 2010, View-invariant gesture recognition using 3d optical flow and harmonic motion context, CVIU, 114, 1353

M.B. Holte, T.B. Moeslund, N. Nikolaidis, I. Pitas, 3d human action recognition for multi-view camera systems, in: 3DIMPVT, 2011, pp. 342–349.

Horn, 1981, Determining optical flow, Artif. Intell., 17, 185, 10.1016/0004-3702(81)90024-2

F. Huguet, F. Devernay, A variational method for scene flow estimation from stereo sequences, in: ICCV, IEEE, 2007, pp. 1–7.

Jalal, 2011, Recognition of human home activities via depth silhouettes and R transformation for smart homes, Indoor Built Environ., 1

H. Jin, P. Favaro, A variational approach to shape from defocus, in: ECCV. Springer, 2002, pp. 18–30.

G. Johansson, Visual motion perception. Scientific American, 1975.

T. Kanade, A. Yoshida, K. Oda, H. Kano, M. Tanaka, A stereo machine for video-rate dense depth mapping and its new applications, in: CVPR, IEEE. 1996, pp. 196–202.

A. Klaser, M. Marszalek, A spatio-temporal descriptor based on 3d-gradients, in: BMVC, 2008.

Kovalev, 1999, Texture anisotropy in 3-d images, Image Proces. IEEE Trans., 8, 346, 10.1109/83.748890

Kulić, 2008, Incremental learning, clustering and hierarchy formation of whole body motion patterns using adaptive hidden markov chains, Int. J. Rob. Res., 27, 761, 10.1177/0278364908091153

A. Kurakin, Z. Zhang, Z. Liu, A real time system for dynamic hand gesture recognition with a depth sensor, 2012.

Laptev, 2005, On space-time interest points, Int. J. Comput. Vis., 64, 107, 10.1007/s11263-005-1838-7

Laptev, 2008, Learning realistic human actions from movies, CVPR, 1

J. Lei, X. Ren, D. Fox, Fine-grained kitchen activity recognition using rgb-d, 2012.

A. Letouzey, B. Petit, E. Boyer, M. Team, Scene flow from depth and color images, in: Jesse Hoey, Stephen McKenna and Emanuele Trucco (Eds.), Proceedings of the British Machine Vision Conference, 2011, pp. 46–1.

Li, 2010, Action recognition based on a bag of 3d points, CVPR Workshop, 1, 9

A.M. Loh, The recovery of 3-D structure using visual texture patterns. University of Western Australia, 2006.

B.D. Lucas, T. Kanade, et al., An iterative image registration technique with an application to stereo vision, in: IJCAI, 1981, pp. 674–679.

Lv, 2006, Recognition and segmentation of 3-d human action using hmm and multi-class adaboost, ECCV, 359

S. Malassiotis, N. Aifanti, M. Strintzis, A gesture recognition system using 3d data, in: 3D Data Processing Visualization and Transmission, 2002. Proceedings. First International Symposium on, IEEE, 2002, pp. 190–193.

Masood, 2011, Measuring and reducing observational latency when recognizing actions, ICCV Workshops, IEEE, 422

Matsumoto, 2000, An algorithm for real-time stereo vision implementation of head pose and gaze direction measurement, Autom. Face Gesture Recognit. IEEE, 499, 10.1109/AFGR.2000.840680

R. Muñoz-Salinas, E. Aguirre, M. García-Silvente, People detection and tracking using stereo vision and color. Image and Vision Computing 25, 2007, 995–1007.

Ni, 2011, Rgbd-hudaact: a color-depth video database for human daily activity recognition, ICCV Workshop, 1147

A. Ogale, A. Karapurkar, G. Guerra-Filho, Y. Aloimonos, View-invariant identification of pose sequences for action recognition, in: VACE, Citeseer, 2004.

Oikonomopoulos, 2005, Spatiotemporal salient points for visual recognition of human actions, Syst. Man Cybern. Part B: Cybern. IEEE Trans., 36, 710, 10.1109/TSMCB.2005.861864

Parameswaran, 2006, View invariance for human action recognition, IJCV, 66, 83, 10.1007/s11263-005-3671-4

Poppe, 2010, A survey on vision-based human action recognition, Image Vision Comput., 28, 976, 10.1016/j.imavis.2009.11.014

Prados, 2004, Unifying approaches and removing unrealistic assumptions in shape from shading: mathematics can help, ECCV. Springer, 141

Prados, 2005, Shape from shading: a well-posed problem?, CVPR, IEEE, 870

Pritchett, 1998, Wide baseline stereo matching, ICCV, IEEE, 754

Roh, 2010, View-independent human action recognition with volume motion template on single stereo camera, Pattern Recognit. Lett., 31, 639, 10.1016/j.patrec.2009.11.017

Russakoff, 2002, Head tracking using stereo, Mach. Vis. Appl., 13, 164, 10.1007/s001380100063

Rusu, 2009, Fast point feature histograms (fpfh) for 3d registration, ICRA, 3212

Sabata, 1991, Estimation of motion from a pair of range images: A review, CVGIP: Image Understanding, 54, 309, 10.1016/1049-9660(91)90032-K

Sabata, 1996, Surface correspondence and motion computation from a pair of range images, Comput. Vision Image Understanding, 63, 232, 10.1006/cviu.1996.0017

Saito, 1995, Application of genetic algorithms to stereo matching of images, Pattern Recognit. Lett., 16, 815, 10.1016/0167-8655(95)00048-L

P. Scovanner, S. Ali, M. Shah, A 3-dimensional sift descriptor and its application to action recognition, in: Proceedings of the 15th international conference on Multimedia, ACM, 2007, pp. 357–360.

Seemann, 2004, Head pose estimation using stereo vision for human-robot interaction, Autom. Face Gesture Recognit. IEEE, 626

Sempena, 2011, Human action recognition using dynamic time warping, ICEEI, IEEE, 1

Y. Shen, H. Foroosh, View-invariant action recognition using fundamental ratios, in: CVPR, IEEE, 2008.

J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, A. Blake, Real-time human pose recognition in parts from a single depth image. CVPR, 2011.

Stamos, 2000, 3-d model construction using range and image data, CVPR, IEEE, 531

J. Sung, C. Ponce, B. Selman, A. Saxena, Human activity detection from RGBD images. PAIR, 2011.

Sung, 2012, Unstructured human activity detection from rgbd images, ICRA, IEEE, 842

Swadzba, 2008, Tracking objects in 6d for reconstructing static scenes, CVPRW, IEEE, 1

Turaga, 2008, Machine recognition of human activities: A survey, Circuits Syst. Video Technol. IEEE Trans., 18, 1473, 10.1109/TCSVT.2008.2005594

Uddin, 2011, Human activity recognition using body joint-angle features and hidden markov model, ETRI J., 33, 569, 10.4218/etrij.11.0110.0314

R. Urtasun, P. Fua, 3d tracking for gait characterization and recognition, in: Automatic Face and Gesture Recognition, 2004. Proceedings. Sixth IEEE International Conference on, IEEE, 2004, pp. 17–22.

Vedula, 1999, Three-dimensional scene flow, ICCV, IEEE, 722

Vieira, 2012, Stop: Space-time occupancy patterns for 3d action recognition from depth map sequences, Progress Pattern Recognit. Image Anal. Comput. Vision Appl., 252, 10.1007/978-3-642-33275-3_31

H. Wang, M.M. Ullah, A. Klaser, I. Laptev, C. Schmid, et al., Evaluation of local spatio-temporal features for action recognition, in: BMVC, 2009.

J. Wang, Z. Liu, J. Chorowski, Z. Chen, Y. Wu, Robust 3d action recognition with random occupancy patterns, in: ECCV. Springer, 2012a, pp. 872–885.

Wang, 2012, Mining actionlet ensemble for action recognition with depth cameras, CVPR, 1290

Wedel, 2011, Stereoscopic scene flow computation for 3d motion understanding, IJCV, 95, 29, 10.1007/s11263-010-0404-0

Weinland, 2007, Action recognition from arbitrary views using 3d exemplars, ICCV, IEEE, 1

Weinland, 2010, Making action recognition robust to occlusions and viewpoint changes, ECCV. Springer, 635

Willems, 2008, An efficient dense and scale-invariant spatio-temporal interest point detector, ECCV, 650

C. Wolf, J. Mille, E. Lombardi, O. Celiktutan, M. B. Jiu, E. Dellandrea, C.E. Bichot, C. Garcia, B. Sankur, The liris human activities dataset and the icpr 2012 human activities recognition and localization competition. Technical Report RR-LIRIS-2012-004, LIRIS Laboratory.

Wu, 2012, One shot learning gesture recognition from rgbd images, CVPRW, IEEE, 7

L. Xia, J. Aggarwal, Spatio-temporal depth cuboid similarity feature for activity recognition using depth camera, in: CVPR, IEEE, 2013.

Xia, 2012, View invariant human action recognition using histograms of 3d joints, CVPRW, IEEE, 20

Yang, 1992, Object modelling by registration of multiple range images, Image Vision Comput., 10, 145, 10.1016/0262-8856(92)90066-C

X. Yang, Y. Tian, Eigenjoints-based action recognition using naïve-bayes-nearest-neighbor, in: CVPRW, IEEE, 2012, pp. 14–19.

X. Yang, C. Zhang, Y. Tian, Recognizing actions using depth motion maps-based histograms of oriented gradients, 2012.

A. Yao, J. Gall, G. Fanelli, L. Van Gool, Does human action recognition benefit from pose estimation?, in: BMVC, 2011.

Yao, 2010, A hough transform-based voting framework for action recognition, CVPR, IEEE, 2061

E. Yu, J. Aggarwal, Human action recognition with extremities as semantic posture representation, in: Computer Vision and Pattern Recognition Workshops, 2009. CVPR Workshops 2009. IEEE Computer Society Conference on, IEEE, 2009, pp. 1–8.

Yu, 2008, Robust 3-d motion tracking from stereo images: A model-less method, Instrum. Meas. IEEE Trans., 57, 622, 10.1109/TIM.2007.911641

Yun, 2012, Two-person interaction detection using body-pose features and multiple instance learning, CVPRW, IEEE, 28

C. Zhang, Y. Tian, Rgbd camera-based daily living activity recognition, 2012.

Zhang, 2011, 4-dimensional local spatio-temporal features for human activity recognition, IROS, 2044

Zhao, 2012, Combing rgb and depth map features for human activity recognition, APSIPA ASC, IEEE, 1

Mitiche, Amar and Aggarwal, Jagdishkumar Keshoram, Computer Vision Analysis of Image Motion by Variational Methods, Springer, 2013.