Multipath affinage stacked-hourglass networks for human pose estimation

Guoguang Hua1, Lihong Li1, Shiguang Liu2
1School of Information and Electrical Engineering, Hebei University of Engineering, Handan, China
2School of Computer Science and Technology, Division of Intelligence and Computing, Tianjin University, Tianjin, China

Abstract

Keywords


References

Chen K, Ding G, Han J. Attribute-based supervised deep learning model for action recognition. Frontiers of Computer Science, 2017, 11(2): 219–229

Varior R R, Shuai B, Lu J. A siamese long short-term memory architecture for human re-identification. In: Proceedings of European Conference on Computer Vision. 2016, 135–153

Sapp B, Taskar B. MODEC: multimodal decomposable models for human pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2013, 3674–3681

Felzenszwalb P, Mcallester D, Ramanan D. A discriminatively trained, multiscale, deformable part model. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2008

Pishchulin L, Andriluka M, Gehler P. Strong appearance and expressive spatial models for human pose estimation. In: Proceedings of the IEEE International Conference on Computer Vision. 2014, 3487–3494

Johnson S, Everingham M. Learning effective human pose estimation from inaccurate annotation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2011, 1465–1472

Ouyang W, Chu X, Wang X. Multi-source deep learning for human pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014, 2329–2336

Ladicky L, Torr P H S, Zisserman A. Human pose estimation using a joint pixel-wise and part-wise formulation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2013, 3578–3585

Liu S G, Li Y, Hua G. Human pose estimation in video via structured space learning and halfway temporal evaluation. IEEE Transactions on Circuits and Systems for Video Technology, 2018, 1

Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems. 2012, 1097–1105

Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift. In: Proceedings of International Conference on Machine Learning. 2015, 448–456

Szegedy C, Liu W, Jia Y. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015, 1–9

Li Y, Liu S G. Temporal-coherency-aware human pose estimation in video via pre-trained res-net and flow-CNN. In: Proceedings of International Conference on Computer Animation and Social Agents. 2017, 150–159

Johnson S, Everingham M. Clustered pose and nonlinear appearance models for human pose estimation. In: Proceedings of the British Machine Vision Conference. 2010, 1–11

Andriluka M, Pishchulin L, Gehler P. 2D human pose estimation: new benchmark and state of the art analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014, 3686–3693

Newell A, Yang K, Deng J. Stacked hourglass networks for human pose estimation. In: Proceedings of European Conference on Computer Vision. 2016, 483–499

Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015, 3431–3440

Andriluka M, Roth S, Schiele B. Pictorial structures revisited: people detection and articulated pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2009, 1014–1021

Andriluka M, Roth S, Schiele B. Monocular 3D pose estimation and tracking by detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2010, 623–630

Lopez Q, Manuel I. Mixing body-parts model for 2D human pose estimation in stereo videos. IET Computer Vision, 2017, 11(6): 426–433

Dalal N, Triggs B. Histograms of oriented gradients for human detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2005, 886–893

Dogan E, Eren G, Wolf C. Multi-view pose estimation with mixtures-of-parts and adaptive viewpoint selection. IET Computer Vision, 2018, 12(4): 403–411

Toshev A, Szegedy C. DeepPose: human pose estimation via deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014, 1653–1660

Tompson J, Goroshin R, Jain A. Efficient object localization using convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015, 648–656

Tompson J, Jain A, LeCun Y. Joint training of a convolutional network and a graphical model for human pose estimation. In: Proceedings of the 28th Annual Conference on Neural Information Processing Systems. 2014, 1799–1807

Carreira J, Agrawal P, Fragkiadaki K. Human pose estimation with iterative error feedback. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 4733–4742

Wei S E, Ramakrishna V, Kanade T. Convolutional pose machines. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 4724–4732

Cao Z, Simon T, Wei S E. Realtime multi-person 2D pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017, 1302–1310

Noh H, Hong S, Han B. Learning deconvolution network for semantic segmentation. In: Proceedings of the IEEE International Conference on Computer Vision. 2015, 1520–1528

Rematas K, Ritschel T, Fritz M. Deep reflectance maps. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 4508–4516

He K M, Zhang X, Ren S. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 770–778

Jaderberg M, Simonyan K, Zisserman A. Spatial transformer networks. In: Proceedings of the 28th International Conference on Neural Information Processing Systems. 2015, 2017–2025

Ferrari V, Marin M, Zisserman A. Progressive search space reduction for human pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2008, 1–8

Yang W, Li S, Ouyang W. Learning feature pyramids for human pose estimation. In: Proceedings of the IEEE International Conference on Computer Vision. 2017, 1281–1290

Yang Y, Ramanan D. Articulated human detection with flexible mixtures of parts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(12): 2878–2890

Yu X, Zhou F, Chandraker M. Deep deformation network for object landmark localization. In: Proceedings of European Conference on Computer Vision. 2016, 52–70

Belagiannis V, Zisserman A. Recurrent human pose estimation. In: Proceedings of the International Conference on Automatic Face and Gesture Recognition. 2017, 468–475

Lifshitz I, Fetaya E, Ullman S. Human pose estimation using deep consensus voting. In: Proceedings of European Conference on Computer Vision. 2016, 246–260

Pishchulin L, Insafutdinov E, Tang S. Deepcut: joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 4929–4937

Insafutdinov E, Pishchulin L, Andres B. Deepercut: a deeper, stronger, and faster multi-person pose estimation model. In: Proceedings of the 14th European Conference on Computer Vision. 2016, 34–50

Hu P, Ramanan D. Bottom-up and top-down reasoning with hierarchical rectified gaussians. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 5600–5609