Convolutional neural networks for crowd behaviour analysis: a survey

The Visual Computer - Volume 35, Issue 5 - Pages 753-776 - 2019
Gaurav Tripathi (1), Kulbir Singh (1), Dinesh Kumar Vishwakarma (2)
(1) Central Research Lab, Bharat Electronics Ltd., Ghaziabad, India
(2) Department of Information Technology, Delhi Technological University, Delhi, India

References

Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)

Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013)

Deng, L.: An overview of deep-structured learning for information processing. In: Asian-Pacific Signal and Information Processing Annual Summit and Conference (APSIPA-ASC), Oct. 2011

Vicsek, T., Zafeiris, A.: Collective motion. Phys. Rep. 517(3), 71–140 (2012)

Hinton, G.: Deep neural networks for acoustic modelling in speech recognition. IEEE Signal Process. Mag. 29(6), 82–97 (2012)

Yu, D., Deng, L.: Deep learning and its applications to signal and information processing. IEEE Signal Process. Mag. 28(1), 145–154 (2011)

Arel, I., Rose, C., Karnowski, T.: Deep machine learning—a new frontier in artificial intelligence. IEEE Comput. Intell. Mag. 5(4), 13–18 (2010)

Deng, L.: A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Trans. Signal Inf. Process. 3, e2 (2014)

Fukushima, K.: Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 36(4), 193–202 (1980)

Lo, S.-C., Lou, S.-L., Lin, J.-S., Freedman, M.T., Chien, M.V., Mun, S.K.: Artificial convolution neural network techniques and applications for lung nodule detection. IEEE Trans. Med. Imaging 14(4), 711–718 (1995)

LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. In: Proceedings of the IEEE (1998)

Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems (NIPS 2012), vol. 25 (2012)

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 1–42 (2014)

Moeslund, T.B., Granum, E.: A survey of computer vision-based human motion capture. Comput. Vis. Image Underst. 81(3), 231–268 (2001)

Bishop, C.M.: Pattern Recognition & Machine Learning, vol. 128, 1st edn, pp. 1–58. Springer, New York (2006)

Kephart, J.O., Chess, D.M.: The vision of autonomic computing. Computer 36(1), 41–50 (2003)

Lemley, J., Bazrafkan, S., Corcoran, P.: Deep learning for consumer devices and services: pushing the limits for machine learning, artificial intelligence, and computer vision. IEEE Consum. Electron. Mag. 6(2), 48–56 (2017)

Leo, M., Medioni, G., Trivedi, M., Kanade, T., Farinella, G.: Computer vision for assistive technologies. Comput. Vis. Image Underst. 15, 1–15 (2017)

Liu, D., Wang, Z., Nasrabadi, N., Huang, T.: Learning a mixture of deep networks for single image super-resolution. In: Asian Conference on Computer Vision (2017)

Wing, J.M.: Computational thinking. Commun. ACM 49(3), 33–35 (2006)

Sun, Y., Fisher, R.: Object-based visual attention for computer vision. Artif. Intell. 146(1), 77–123 (2003)

Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015)

Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B., Liu, T., Wang, X., Wang, G.: Recent advances in convolutional neural networks. eprint arXiv:1512.07108, Dec. 2015

LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)

Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)

Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep networks. In: International Conference on Neural Information Processing Systems (2007)

Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.-A.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11, 3371–3408 (2010)

Hubel, D.H., Wiesel, T.N.: Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195(1), 215–243 (1968)

LeCun, Y., Cortes, C., Burges, C.J.: MNIST handwritten digit database (2010)

Gewin, V.: Turning point: intelligence programmer. Nature 533(281), 145–284 (2016)

Clark, C., Storkey, A.: Teaching deep convolutional neural networks to play go. arXiv preprint arXiv:1412.3409 (2014)

Wallach, I., Dzamba, M., Heifets, A.: AtomNet: a deep convolutional neural network for bioactivity prediction in structure-based drug discovery. arXiv preprint arXiv:1510.02855 (2015)

Weisstein, E.W.: Convolution. From MathWorld—a Wolfram web resource (2009)

Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: ECCV (2014)

Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 (2014)

Szegedy, C., et al.: Going deeper with convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition, Boston (2015)

He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. eprint arXiv:1512.03385 (2015)

He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: European Conference on Computer Vision, Amsterdam (2016)

Zagoruyko, S., Komodakis, N.: Wide residual networks. arXiv preprint arXiv:1605.07146 (2016)

Singh, S., Hoiem, D., Forsyth, D.: Swapout: learning an ensemble of deep architectures. arXiv preprint arXiv:1605.06465 (2016)

Targ, S., Almeida, D., Lyman, K.: Resnet in resnet: generalizing residual architectures. arXiv preprint arXiv:1603.08029 (2016)

Zhang, K., Sun, M., Han, T.X., Yuan, X., Guo, L., Liu, T.: Residual networks of residual networks: multilevel residual networks. IEEE Trans. Circuits Syst. Video Technol. (2016). https://doi.org/10.1109/TCSVT.2017.2654543

Ngiam, J., Chen, Z., Chia, D., Koh, P.W., Le, Q.V., Ng, A.Y.: Tiled convolutional neural networks. In: NIPS (2010)

Wang, Z., Oates, T.: Encoding time series as images for visual inspection and classification using tiled convolutional neural networks. In: AAAI Workshop (2015)

Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. In: ICLR (2016)

Kalchbrenner, N., Espeholt, L., Simonyan, K., Oord, A., Graves, A., Kavukcuoglu, K.: Neural machine translation in linear time. arXiv preprint arXiv:1610.10099 (2016)

Sercu, T., Goel, V.: Dense prediction on sequences with time-dilated convolutions for speech recognition. In: NIPS Workshop (2016)

van den Oord, A., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., Kavukcuoglu, K.: Wavenet: a generative model for raw audio. arXiv preprint arXiv:1609.03499 (2016)

Lin, M., Chen, Q., Yan, S.: Network in network. arXiv:1312.4400 (2013)

Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, Inception-ResNet and the impact of residual connections on learning. arXiv:1602.07261 (2016)

Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. arXiv:1411.4038 (2015)

Zeiler, M.D., Krishnan, D., Taylor, G.W., Fergus, R.: Deconvolutional networks. In: CVPR (2010)

Zeiler, M.D., Taylor, G.W., Fergus, R.: Adaptive deconvolutional networks for mid and high level feature learning. In: ICCV (2011)

Bruna, J., Szlam, A., LeCun, Y.: Signal recovery from pooling representations. eprint arXiv:1311.4025 (2014)

Gulcehre, C., Cho, K., Pascanu, R., Bengio, Y.: Learned-norm pooling for deep feedforward and recurrent neural networks. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases (2014)

Simoncelli, E.P., Heeger, D.J.: A model of neuronal responses in visual area MT. Vis. Res. 38(5), 743–761 (1998)

Hyvärinen, A., Köster, U.: Complex cell pooling and the statistics of natural images. Netw. Comput. Neural Syst. 18(2), 81–100 (2007)

Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.R.: Improving neural networks by preventing co-adaptation of feature detectors. eprint arXiv:1207.0580 (2012)

Wan, L., Zeiler, M., Zhang, S., LeCun, Y., Fergus, R.: Regularization of neural networks using dropconnect. In: PMLR (2013)

Zeiler, M.D., Fergus, R.: Stochastic pooling for regularization of deep convolutional neural networks. eprint arXiv:1301.3557 (2013)

Rippel, O., Snoek, J., Adams, R.P.: Spectral representations for convolutional neural networks. In: NIPS, Montreal (2015)

Gong, Y., Ke, Q., Isard, M., Lazebnik, S.: A multi-view embedding space for modeling internet images, tags, and their semantics. Int. J. Comput. Vis. 106(2), 210–233 (2014)

Jégou, H., Perronnin, F., Douze, M., Sanchez, J., Perez, P., Schmid, C.: Aggregating local image descriptors into compact codes. IEEE Trans. Pattern Anal. Mach. Intell. 34(9), 1704–1716 (2012)

Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: International Conference on Machine Learning, Haifa (2010)

Maas, A.L., Hannun, A.Y., Ng, A.Y.: Rectifier nonlinearities improve neural network acoustic models. In: ICML Workshop on Deep Learning for Audio, Speech and Language Processing (2013)

Goodfellow, I.J., Warde-Farley, D., Mirza, M., Courville, A., Bengio, Y.: Maxout networks. In: International Conference on Machine Learning, Atlanta (2013)

Springenberg, J.T., Riedmiller, M.: Improving deep neural networks with probabilistic maxout units. arXiv preprint arXiv:1312.6116 (2013)

He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: IEEE International Conference on Computer Vision (2015)

Xu, B., Wang, N., Chen, T., Li, M.: Empirical evaluation of rectified activations in convolutional network. arXiv preprint arXiv:1505.00853 (2015)

Clevert, D.-A., Unterthiner, T., Hochreiter, S.: Fast and accurate deep network learning by exponential linear units (ELUs). arXiv preprint arXiv:1511.07289 (2015)

Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: International Conference on Computational Statistics (COMPSTAT’2010) (2010)

Wijnhoven, R.G., de With, P.H.N.: Fast training of object detection using stochastic gradient descent. In: 2010 20th International Conference on Pattern Recognition (ICPR). IEEE (2010)

Zinkevich, M.A., Weimer, M., Smola, A., Li, L.: Parallelized stochastic gradient descent. In: NIPS, Vancouver (2010)

Recht, B., Re, C., Wright, S., Niu, F.: Hogwild: a lock-free approach to parallelizing stochastic gradient descent. In: NIPS (2011)

Bengio, Y.: Deep learning of representations: looking forward. In: International Conference on Statistical Language and Speech Processing (2013)

Dean, J., Corrado, G.S., Monga, R., Chen, K., Devin, M., Le, Q.V., Mao, M.Z., Ranzato, M., Senior, A., Tucker, P., Yang, K., Ng, A.Y.: Large scale distributed deep networks. In: NIPS. Lake Tahoe, Nevada (2012)

Zhuang, Y., Chin, W.-S., Juan, Y.-C., Lin, C.-J.: A fast parallel SGD for matrix factorization in shared memory systems. In: ACM Conference on Recommender Systems, Hong Kong (2013)

Thoma, M.: Analysis and optimization of convolutional neural network architectures. arXiv preprint arXiv:1707.09725 (2017)

Ooi, B.C., et al.: SINGA: a distributed deep learning platform. In: ACM International Conference on Multimedia, Brisbane (2015)

Jia, Y., et al.: Caffe: convolutional architecture for fast feature embedding. In: ACM International Conference on Multimedia, Orlando (2014)

http://deeplearning4j.org/ . Last visited 27 May 2017

King, D.E.: Dlib-ml: a machine learning toolkit. J. Mach. Learn. Res. 10, 1755–1758 (2009)

Seide, F.: Keynote: the computer science behind the Microsoft Cognitive Toolkit: An open source large-scale deep learning toolkit for Windows and Linux. In: IEEE/ACM International Symposium on Code Generation and Optimization (CGO) (2017)

Chen, T., et al.: Mxnet: a flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274 (2015)

Lopez, R.: OpenNN: an open source neural networks C++ library [software] (2014)

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Ghemawat, S.: TensorFlow: large scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467 (2016)

Bastien, F., Lamblin, P., Pascanu, R., Bergstra, J., Goodfellow, I., Bergeron, A., Bouchard, N., Warde-Farley, D., Bengio, Y.: Theano: new features and speed improvements. In: Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop (2012)

Collobert, R., Kavukcuoglu, K., Farabet, C.: Torch7: a matlab-like environment for machine learning. In: BigLearn, NIPS Workshop (No. EPFL-CONF-192376) (2011)

Wu, S., Moore, B.E., Shah, M.: Chaotic invariants of Lagrangian particle trajectories for anomaly detection in crowded scenes. In: IEEE Conference on Computer Vision and Pattern Recognition, San Francisco (2010)

Zitouni, M.S., Bhaskar, H., Dias, J., Al-Mualla, M.: Advances and trends in visual crowd analysis: a systematic survey and evaluation of crowd modelling techniques. Neurocomputing 186, 139–159 (2016)

Rodriguez, M., Laptev, I., Sivic, J., Audibert, J.Y.: Density-aware person detection and tracking in crowds. In: IEEE International Conference on Computer Vision (2011)

Xu, D., Song, R., Wu, X., Li, N., Feng, W., Qian, H.: Video anomaly detection based on a hierarchical activity discovery within spatio-temporal context. Neurocomputing 143, 144–152 (2014)

Cheng, Z., Qin, L., Huang, Q., Yan, S., Tian, Q.: Recognizing human group action by layered model with multiple cues. Neurocomputing 136, 124–135 (2014)

Liang, R., Zhu, Y., Wang, H.: Counting crowd flow based on feature points. Neurocomputing 133, 377–384 (2014)

Zhan, B., Monekosso, D.N., Remagnino, P., Velastin, S.A., Xu, L.-Q.: Crowd analysis: a survey. Mach. Vis. Appl. 19(5–6), 345–357 (2008)

Rodrigues, F., Lourenco, M., Ribeiro, B., Pereira, F.: Learning supervised topic models for classification and regression from crowds. IEEE Trans. Pattern Anal. Mach. Intell. 99, 1–1 (2017)

Ali, S., Shah, M.: A Lagrangian particle dynamics approach for crowd flow segmentation and stability analysis. In: IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, pp. 1–6 (2007)

McIvor, A.M.: Background subtraction techniques, image and vision computing. Proc. Image Vis. Comput. 4, 3099–3104 (2000)

Black, M.J., Fleet, D.J.: Probabilistic detection and tracking of motion boundaries. Int. J. Comput. Vis. 38(3), 231–245 (2000)

Garcia-Bunster, G., Torres-Torriti, M., Oberli, C.: Crowded pedestrian counting at bus stops from perspective transformations of foreground areas. IET Comput. Vis. 6(4), 296–305 (2012)

Chen, D.Y., Huang, P.C.: Visual-based human crowds behavior analysis based on graph modeling and matching. IEEE Sens. J. 13(6), 2129–2138 (2013)

Stauffer, C., Grimson, W.E.L.: Adaptive background mixture models for real-time tracking. In: IEEE Conference Computer Vision and Pattern Recognition (1999)

Chan, A.B., Vasconcelos, N.: Modeling, clustering, and segmenting video with mixtures of dynamic textures. IEEE Trans. Pattern Anal. Mach. Intell. 30(5), 909–926 (2008)

Junior, J.C.S.J., Musse, S.R., Jung, C.R.: Crowd analysis using computer vision techniques. IEEE Signal Process. Mag. 27(5), 66–77 (2010)

http://www.desibrandstrategy.com/why-tirupati-tirumala-needs-smarter-analytics/ . Accessed 17 June 2017

http://l7.alamy.com/zooms/dab050fbd7424ff597ca74599e8eb7f9/holi-celebration-in-dauji-temple-dauji-uttar-pradesh-india-asia-d2r9h0.jpg . Accessed 17 June 2017

https://cdn.theatlantic.com/assets/media/img/photo/2011/03/holi-the-festival-of-colors-2011/h15_19113087/main_900.jpg?1420521857 . Accessed 17 June 2017

https://en.wikipedia.org/wiki/List_of_human_stampedes_in_Hindu_temples . Accessed 30 Dec 2017

http://edition.cnn.com/2017/05/22/europe/manchester-arena-incident/ . Accessed 23 May 2017

http://www.dailynewsegypt.com/2015/02/09/28-football-fans-killed-deliberate-massacre-ultras/ . Accessed 9 Feb 2015

http://robertchaen.com/2015/01/01/7935/ . Accessed 1 Jan 2015

https://en.wikipedia.org/wiki/List_of_terrorist_incidents_in_India . Accessed 11 Feb 2018

Dimokranitou, A., Tsechpenakis, G.: Adversarial autoencoders for anomalous event detection in images. Thesis, Purdue University (2017)

Saxena, S.: Crowd behavior recognition for video surveillance. In: International Conference on Advanced Concepts for Intelligent Vision Systems (2008)

Husni, M., Suryana, N.: Crowd event detection in computer vision. In: International Conference on Signal Processing Systems (ICSPS) (2010)

Mehran, R., Oyama, A., Shah, M.: Abnormal crowd behavior detection using social force model. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2009)

Rodriguez, M., Ali, S., Kanade, T.: Tracking in unstructured crowded scenes. In: International Conference on Computer Vision (2009)

Ozturk, O., Yamasaki, T., Aizawa, K.: Detecting dominant motion flows in unstructured/structured crowd scenes. In: International Conference on Pattern Recognition (ICPR), Istanbul (2010)

Sjarif, N.N.A., Shamsuddin, S.M., Hashim, S.Z.: Detection of abnormal behaviors in crowd scene: a review. Int. J. Adv. Soft Comput. Appl. 4(1), 1–33 (2012)

Yu, H., Zhou, Y., Simmons, J., Przybyla, C.P., Lin, Y., Fan, X., Mi, Y., Wang, S.: Groupwise tracking of crowded similar-appearance targets from low-continuity image sequences. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

Ihaddadene, N., Djeraba, C.: Real-time crowd motion analysis. In: International Conference on Pattern Recognition, Tampa (2008)

Johansson, A., Helbing, D., Al-Abideen, H.Z., Al-Bosta, S.: From crowd dynamics to crowd safety: a video-based analysis. Adv. Complex Syst. 11(4), 497–527 (2008)

Cao, T., Wu, X., Guo, J., Yu, S., Xu, Y.: Abnormal crowd motion analysis. In: IEEE International Conference on Robotics and Biomimetics (ROBIO), Guilin (2009)

Hassner, T., Itcher, Y., Kliper-Gross, O.: Violent flows: Real-time detection of violent crowd behavior. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, Providence (2012)

Krausz, B., Bauckhage, C.: Automatic detection of dangerous motion behavior in human crowds. In: IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS), Klagenfurt (2011)

Liao, H., Xiang, J., Sun, W., Feng, Q., Dai, J.: An abnormal event recognition in crowd scene. In: International Conference on Image and Graphics (ICIG), Hefei (2011)

Wang, B., Ye, M., Li, X., Zhao, F., Ding, J.: Abnormal crowd behavior detection using high-frequency and spatio-temporal features. Mach. Vis. Appl. 23(3), 501–511 (2012)

Andersson, M., Gustafsson, F., St-Laurent, L., Prevost, D.: Recognition of anomalous motion patterns in urban surveillance. IEEE J. Sel. Top. Signal Process. 7(1), 102–110 (2013)

Cho, S.-H., Kang, H.-B.: Abnormal behavior detection using hybrid agents in crowded scenes. Pattern Recognit. Lett. 44, 64–70 (2014)

Candamo, J., Shreve, M., Goldgof, D.B., Sapper, D.B., Kasturi, R.: Understanding transit scenes: a survey on human behavior-recognition algorithms. IEEE Trans. Intell. Transp. Syst. 11(1), 206–224 (2010)

Ge, W., Collins, R.T., Ruback, R.B.: Vision-based analysis of small groups in pedestrian crowds. IEEE Trans. Pattern Anal. Mach. Intell. 34(5), 1003–1016 (2012)

Solmaz, B., Moore, B.E., Shah, M.: Identifying behaviors in crowd scenes using stability analysis for dynamical systems. IEEE Trans. Pattern Anal. Mach. Intell. 34(10), 2064–2070 (2012)

Krausz, B., Bauckhage, C.: Loveparade 2010: automatic video analysis of a crowd disaster. Comput. Vis. Image Underst. 116(3), 307–319 (2012)

Ge, W., Collins, R.T., Ruback, B.: Automatically detecting the small group structure of a crowd. In: Workshop on Applications of Computer Vision (WACV), Snowbird (2009)

Dee, H.M., Caplier, A.: Crowd behaviour analysis using histograms of motion direction. In: International Conference on Image Processing (ICIP), Hong Kong (2010)

Subburaman, V.B., Descamps, A., Carincotte, C.: Counting people in the crowd using a generic head detector. In: International Conference on Advanced Video and Signal-Based Surveillance (AVSS), Beijing (2012)

Loy, C.C., Chen, K., Gong, S., Xiang, T.: Crowd counting and profiling: methodology and evaluation. In: Ali, S., Nishino, K., Manocha, D., Shah, M. (eds.) Modeling, Simulation and Visual Analysis of Crowds: A Multidisciplinary Perspective, pp. 347–382. Springer, New York (2013)

Ullah, H., Conci, N.: Crowd motion segmentation and anomaly detection via multi-label optimization. In: ICPR Workshop on Pattern Recognition and Crowd Analysis (2012)

Krisp, J.M., Peters, S., Burkert, F.: Visualizing crowd movement patterns using a directed kernel density estimation. In: Earth Observation of Global Changes (EOGC), pp. 255–268. Springer, Berlin (2013)

Ullah, H., Conci, N.: Structured learning for crowd motion segmentation. In: International Conference on Image Processing (ICIP), Melbourne (2013)

Ullah, H., Ullah, M., Conci, N.: Dominant motion analysis in regular and irregular crowd scenes. In: International Workshop on Human Behavior Understanding (2014)

Li, W., Wu, X., Matsumoto, K., Zhao, H.-A.: Crowd density estimation: an improved approach. In: International Conference on Signal Processing (ICSP), Beijing (2010)

Hsu, W.-L., Lin, K.-F., Tsai, C.-L.: Crowd density estimation based on frequency analysis. In: International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP), Dalian (2011)

Zhang, Z., Li, M.: Crowd density estimation based on statistical analysis of local intra-crowd motions for public area surveillance. Opt. Eng. 51(4), 047204 (2012)

Zhou, B., Zhang, F., Peng, L.: Higher-order SVD analysis for crowd density estimation. Comput. Vis. Image Underst. 116(9), 1014–1021 (2012)

Idrees, H., Soomro, K., Shah, M.: Detecting humans in dense crowds using locally-consistent scale prior and global occlusion reasoning. IEEE Trans. Pattern Anal. Mach. Intell. 37(10), 1986–1998 (2015)

Rao, A.S., Gubbi, J., Marusic, S., Palaniswami, M.: Estimation of crowd density by clustering motion cues. Vis. Comput. 31(11), 1533–1552 (2015)

Wang, L., Hu, W., Tan, T.: Recent developments in human motion analysis. Pattern Recognit. 36(3), 585–601 (2003)

Hu, W., Tan, T., Wang, L., Maybank, S.: A survey on visual surveillance of object motion and behaviors. IEEE Trans. Syst. Man Cybern. 34(3), 334–352 (2004)

Sodemann, A.A., Ross, M.P., Borghetti, B.J.: A review of anomaly detection in automated surveillance. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 42(6), 1257–1272 (2012)

Gowsikhaa, D., Abirami, S., Baskaran, R.: Automated human behavior analysis from surveillance videos: a survey. Artif. Intell. Rev. 42(4), 747–765 (2014)

Thida, M., Yong, Y., Climent-Pérez, P., Eng, H., Remagnino, P.: A literature review on video analytics of crowded scenes. In: Atrey, P., Kankanhalli, M., Cavallaro, A. (eds.) Intelligent Multimedia Surveillance, pp. 17–36. Springer, Berlin (2013)

Jo, H., Chug, K., Sethi, R.: A review of physics-based methods for group and crowd analysis in computer vision. J. Postdr. Res. 1(1), 4–7 (2013)

Vishwakarma, S., Agrawal, A.: A survey on activity recognition and behavior understanding in video surveillance. Vis. Comput. 29(10), 983–1009 (2013)

Afsar, P., Cortez, P., Santos, H.: Automatic visual detection of human behavior: a review from 2000 to 2014. Expert Syst. Appl. 42(20), 6935–6956 (2015)

Li, T., Chang, H., Wang, M., Ni, B., Hong, R., Yan, S.: Crowded scene analysis: a survey. IEEE Trans. Circuits Syst. Video Technol. 25(3), 367–386 (2015)

Kok, V.J., Lim, M.K., Chan, C.S.: Crowd behavior analysis: a review where physics meets biology. Neurocomputing 177, 342–362 (2016)

Grant, J.M., Flynn, P.J.: Crowd scene understanding from video: a survey. ACM Trans. Multimed. Comput. Commun. Appl. 13(2), 1–23 (2017)

Hughes, R.L.: The flow of human crowds. Annu. Rev. Fluid Mech. 35(1), 169–182 (2003)

Leggett, R.: Real-time crowd simulation: a review. http://www.leggettnet.org.uk/docs/crowdsimulation.pdf (2004). Accessed 19 Jan 2015

Fisher, L.: The Perfect Swarm: The Science of Complexity in Everyday Life. Basic Books, New York (2009)

Moore, B.E., Ali, S., Mehran, R., Shah, M.: Visual crowd surveillance through a hydrodynamics lens. Commun. ACM 54(12), 64–73 (2011)

Shao, J., Loy, C.C., Kang, K., Wang, X.: Crowded scene understanding by deeply learned volumetric slices. IEEE Trans. Circuits Syst. Video Technol. 27(3), 613–623 (2017)

Andrearczyk, V., Whelan, P.F.: Convolutional neural network on three orthogonal planes for dynamic texture classification. arXiv preprint arXiv:1703.05530 (2017)

Sabokrou, M., Fayyaz, M., Fathy, M., Klette, R.: Deep-cascade: cascading 3D deep neural networks for fast anomaly detection and localization in crowded scenes. IEEE Trans. Image Process. 26(4), 1992–2004 (2017)

Kumagai, S., Hotta, K., Kurita, T.: Mixture of counting CNNs: adaptive integration of CNNs specialized to specific appearance for crowd counting. arXiv preprint arXiv:1703.09393 (2017)

Zeng, L., Xu, X., Cai, B., Qiu, S., Zhang, T.: Multi-scale Convolutional Neural Networks for Crowd Counting. arXiv preprint arXiv:1702.02359 (2017)

Zhuang, N., Yusufu, T., Ye, J., Hua, K.A.: Group activity recognition with differential recurrent convolutional neural networks. In: International Conference on Automatic Face & Gesture Recognition (FG 2017) (2017)

Ahsan, U., Sun, C., Hays, J., Essa, I.: Complex event recognition from images with few training examples. In: IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa (2017)

Zhang, C., Li, H., Wang, X., Yang, X.: Cross-scene crowd counting via deep convolutional neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition, Boston (2015)

Kang, K., Wang, X.: Fully convolutional neural networks for crowd segmentation. arXiv preprint arXiv:1411.4464 (2014)

Yun, S., Yun, K., Choi, J., Choi, J.Y.: Density-aware pedestrian proposal networks for robust people detection in crowded scenes. In: European Conference on Computer Vision (2016)

Walach, E., Wolf, L.: Learning to count with CNN boosting. In: European Conference on Computer Vision (2016)

Tieleman, T., Hinton, G.: Lecture 6.5–RmsProp: divide the gradient by a running average of its recent magnitude. COURSERA Neural Netw. Mach. Learn. 4, 26–31 (2012)

Carvalho, J., Marques, M., Costeira, J.P.: Understanding people flow in transportation hubs. arXiv preprint arXiv:1705.00027 (2017)

Boominathan, L., Kruthiventi, S.S., Babu, R.V.: CrowdNet: a deep convolutional network for dense crowd counting. In: ACM on Multimedia Conference, Amsterdam (2016)

Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2018)

Onoro-Rubio, D., López-Sastre, R.J.: Towards perspective-free object counting with deep learning. In: European Conference on Computer Vision (2016)

Kang, D., Dhar, D., Chan, A.B.: Crowd counting by adapting convolutional neural networks with side information. arXiv preprint arXiv:1611.06748 (2016)

Marsden, M., McGuinness, K., Little, S., O’Connor, N.E.: Fully convolutional crowd counting on highly congested scenes. arXiv preprint arXiv:1612.00220 (2016)

Zhao, Z., Li, H., Zhao, R., Wang, X.: Crossing-line crowd counting with two-phase deep neural networks. In: European Conference on Computer Vision (2016)

Sourtzinos, P., Velastin, S.A., Jara, M., Zegers, P., Makris, D.: People counting in videos by fusing temporal cues from spatial context-aware convolutional neural networks. In: European Conference on Computer Vision (2016)

Chattopadhyay, P., Vedantam, R., Selvaraju, R.R., Batra, D., Parikh, D.: Counting everyday objects in everyday scenes. arXiv preprint arXiv:1604.03505 (2016)

Kingma, D., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

Sheng, B., Shen, C., Lin, G., Li, J., Yang, W., Sun, C.: Crowd counting via weighted VLAD on dense attribute feature maps. IEEE Trans. Circuits Syst. Video Technol. 99, 1–1 (2016)

Yi, S.: Pedestrian Behavior Modeling and Understanding in Crowds. Thesis, The Chinese University of Hong Kong, Hong Kong (2016)

Cao, L., Zhang, X., Ren, W., Huang, K.: Large scale crowd analysis based on convolutional neural network. Pattern Recognit. 48(10), 3016–3024 (2015)

Hu, Y., Chang, H., Nian, F., Wang, Y., Li, T.: Dense crowd counting from still images with convolutional neural networks. J. Vis. Commun. Image Represent. 38, 530–539 (2016)

Shao, J., Loy, C.C., Kang, K., Wang, X.: Slicing convolutional neural network for crowd video understanding. In: IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas (2016)

Burney, A., Syed, T.Q.: Crowd video classification using convolutional neural networks. In: International Conference on Frontiers of Information Technology (FIT), Islamabad (2016)

Ravanbakhsh, M., Nabi, M., Mousavi, H., Sangineto, E., Sebe, N.: Plug-and-play cnn for crowd motion analysis: an application in abnormal event detection. arXiv preprint arXiv:1610.00307 (2016)

Zhang, Y., Zhou, D., Chen, S., Gao, S., Ma, Y.: Single-image crowd counting via multi-column convolutional neural network. In: IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas (2016)

Wang, T., Li, G., Lei, J., Li, S., Xu, S.: Crowd counting based on MMCNN in still images. In: Scandinavian Conference on Image Analysis (2017)

Sabokrou, M., Fayyaz, M., Fathy, M., Moayed, Z., Klette, R.: Fully convolutional neural network for fast anomaly detection in crowded scenes. arXiv preprint arXiv:1609.00866 (2016)

Fu, M., Xu, P., Li, X., Liu, Q., Ye, M., Zhu, C.: Fast crowd density estimation with convolutional neural networks. Eng. Appl. Artif. Intell. 43, 81–88 (2015)

Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: IEEE conference on Computer Vision and Pattern Recognition, Columbus (2014)

Wang, C., Zhang, H., Yang, L., Liu, S., Cao, X.: Deep people counting in extremely dense crowds. In: Proceedings of the 23rd ACM International Conference on Multimedia, Brisbane, Australia (2015)

Sindagi, V.A., Patel, V.M.: CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting. In: 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) (2017)

Sam, D.B., Surya, S., Babu, R.V.: Switching convolutional neural network for crowd counting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017)

Sindagi, V.A., Patel, V.M.: Generating high-quality crowd density maps using contextual pyramid CNNs. In: IEEE International Conference on Computer Vision, Venice, Italy (2017)

Xiong, F., Shi, X., Yeung, D.-Y.: Spatiotemporal modeling for crowd counting in videos. arXiv preprint arXiv:1707.07890 (2017)

Liu, B., Vasconcelos, N.: Bayesian model adaptation for crowd counts. In: Proceedings of the IEEE International Conference on Computer Vision (2015)

Sindagi, V.A., Patel, V.M.: A survey of recent advances in CNN-based single image crowd counting and density estimation. Pattern Recognit. Lett. (2017). https://doi.org/10.1016/j.patrec.2017.07.007

Pham, V.-Q., Kozakaya, T., Yamaguchi, O., Okada, R.: Count forest: co-voting uncertain number of targets using random forest for crowd density estimation. In: Proceedings of the IEEE International Conference on Computer Vision (2015)

Shao, J., Kang, K., Loy, C.C., Wang, X.: Deeply learned attributes for crowded scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015)

Alahi, A., Goel, K., Ramanathan, V., Robicquet, A., Fei-Fei, L., Savarese, S.: Social LSTM: human trajectory prediction in crowded spaces. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 961–971 (2016)

Gutoski, M., Aquino, N.M.R., Ribeiro, M., Lazzaretti, E.A., Lopes, S.H.: Detection of video anomalies using convolutional autoencoders and one-class support vector machines. http://cbic2017.org/

Feng, Y., Yuan, Y., Lu, X.: Learning deep event models for crowd anomaly detection. Neurocomputing 219, 548–556 (2017)

Zhou, S., Shen, W., Zeng, D., Fang, M., Wei, Y., Zhang, Z.: Spatial-temporal convolutional neural networks for anomaly detection and localization in crowded scenes. Sig. Process. Image Commun. 47, 358–368 (2016)

Smeureanu, S., Ionescu, R.T., Popescu, M., Alexe, B.: Deep appearance features for abnormal behavior detection in video. In: Image Analysis and Processing—ICIAP 2017, Catania, Italy (2017)

Sun, J., Shao, J., He, C.: Abnormal event detection for video surveillance using deep one-class learning. Multimed. Tools Appl. (2017). https://doi.org/10.1007/s11042-017-5244-2

Hinami, R., Mei, T., Satoh, S.: Joint detection and recounting of abnormal events by learning deep generic knowledge. arXiv preprint arXiv:1709.09121 (2017)

Péteri, R., Fazekas, S., Huiskes, M.J.: DynTex: a comprehensive database of dynamic textures. Pattern Recognit. Lett. 31(12), 1627–1632 (2010)

Doretto, G., Chiuso, A., Wu, Y.N., Soatto, S.: Dynamic textures. Int. J. Comput. Vis. 51(2), 91–109 (2003)

Ghanem, B., Ahuja, N.: Maximum margin distance learning for dynamic texture recognition. In: Computer Vision–ECCV 2010 (2010)

Chan, A.B., Liang, Z.-S.J., Vasconcelos, N.: Privacy preserving crowd monitoring: counting people without people models or tracking. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2008 (2008)

Chen, K., Loy, C.C., Gong, S., Xiang, T.: Feature mining for localised crowd counting. In: BMVC (2012)

Idrees, H., Saleemi, I., Seibert, C., Shah, M.: Multi-source multi-scale counting in extremely dense crowd images. In: IEEE Conference on Computer Vision and Pattern Recognition (2013)

Blunsden, S., Fisher, R.B.: The BEHAVE video dataset: ground truthed video for multi-person behavior classification. Ann. BMVA 4, 1–12 (2010)

Papadopoulos, S., Schinas, E., Mezaris, V., Troncy, R., Kompatsiaris, I.: Social event detection at MediaEval 2012: challenges, dataset and evaluation. In: Proceedings of MediaEval 2012 Workshop (2012)

Li, L., Su, H., Xing, E., Fei-Fei, L.: Object bank: a high-level image representation for scene classification and semantic feature sparsification. In: NIPS (2010)

Everingham, M., Eslami, S.M.A., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL visual object classes challenge: a retrospective. Int. J. Comput. Vis. 111(1), 98–136 (2015)

Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollar, P., Zitnick, C.L.: Microsoft COCO: common objects in context. In: ECCV, pp. 740–755 (2014)

Dollar, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: an evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 34(4), 743–761 (2012)

Shang, C., Ai, H., Bai, B.: End-to-end crowd counting via joint learning local and global count. In: 2016 IEEE International Conference on Image Processing (ICIP), Phoenix (2016)

Conigliaro, D., Rota, P., Setti, F., Bassetti, C., Conci, N., Sebe, N., Cristani, M.: The S-Hock Dataset: analyzing crowds at the stadium. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)

Shao, J., Loy, C.C., Wang, X.: Scene-independent group profiling in crowd. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2014)

Wu, S., Yang, H., Zheng, S., Su, H., Fan, Y., Yang, M.-H.: Crowd behavior analysis via curl and divergence of motion trajectories. Int. J. Comput. Vis. 123(3), 499–519 (2017)

Yoo, Y., Yun, K., Yun, S., Hong, J., Jeong, H., Choi, J.Y.: Visual path prediction in complex scenes with crowded moving objects. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)