Anomaly Detection on Public Streets Using Spatial Features and a Bidirectional Sequential Classifier
Abstract
The anomaly detection problem consists of identifying events that do not conform to an expected behavior pattern. In law enforcement and security, detecting anomalous events helps identify suspicious behavior. This paper addresses this problem in public areas by monitoring surveillance videos. Our approach uses a convolutional neural network for spatial feature extraction, followed by a time series classifier composed of a one-dimensional convolutional layer and an ensemble of stacked bidirectional recurrent networks. The proposed methodology selects a pre-trained convolutional architecture as the spatial feature extractor and applies transfer learning to specialize it for anomaly detection in surveillance videos. We ran experiments on the UCSD Anomaly Detection Dataset and the CUHK Avenue Dataset for Abnormal Event Detection to compare our approach with other works. Our evaluation protocol uses the Area Under the Receiver Operating Characteristic Curve (AUC), the Equal Error Rate (EER), and the Area Under the Precision vs. Recall Curve (AUPRC). In the experiments, the model obtained AUC above $92\%$ and EER below $15\%$, which are comparable to results in the current literature.
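The three metrics named in the evaluation protocol can be illustrated with a minimal pure-Python sketch. It assumes frame-level binary labels (1 = anomalous) and one real-valued anomaly score per frame. The function names `roc_points`, `auc`, `eer`, and `average_precision` are hypothetical and are not taken from the authors' evaluation code. The step-wise average precision used here is one common estimator of AUPRC, not necessarily the one used in the paper.

```python
def _sorted_groups(labels, scores):
    """Yield (true_positives, false_positives) added at each distinct score,
    sweeping the decision threshold from the highest score to the lowest."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    i = 0
    while i < len(order):
        s = scores[order[i]]
        tp = fp = 0
        # Frames tied at the same score cross the threshold together.
        while i < len(order) and scores[order[i]] == s:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        yield tp, fp


def roc_points(labels, scores):
    """Return the ROC curve as a list of (FPR, TPR) points, starting at (0, 0)."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one anomalous and one normal frame")
    points = [(0.0, 0.0)]
    tp = fp = 0
    for d_tp, d_fp in _sorted_groups(labels, scores):
        tp += d_tp
        fp += d_fp
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def eer(points):
    """Equal error rate: the error value where FPR equals FNR = 1 - TPR,
    found by linear interpolation between neighbouring ROC points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d0 = x0 - (1.0 - y0)  # FPR - FNR at the left end of the segment
        d1 = x1 - (1.0 - y1)  # FPR - FNR at the right end
        if d0 <= 0.0 <= d1:
            if d1 == d0:
                return x0
            t = -d0 / (d1 - d0)
            return x0 + t * (x1 - x0)
    raise RuntimeError("ROC curve does not cross the FPR = FNR line")


def average_precision(labels, scores):
    """Step-wise average precision, a common estimator of AUPRC."""
    pos = sum(labels)
    if pos == 0:
        raise ValueError("need at least one anomalous frame")
    tp = fp = 0
    prev_recall = 0.0
    ap = 0.0
    for d_tp, d_fp in _sorted_groups(labels, scores):
        tp += d_tp
        fp += d_fp
        recall = tp / pos
        ap += (recall - prev_recall) * (tp / (tp + fp))
        prev_recall = recall
    return ap
```

A higher AUC and AUPRC and a lower EER indicate better separation between normal and anomalous frames. For example, calling `eer(roc_points(labels, scores))` on per-frame scores gives the error rate at the threshold where false alarms and missed anomalies are equally frequent.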