Inception single shot multi-box detector with affinity propagation clustering and their application in multi-class vehicle counting

Springer Science and Business Media LLC - Volume 51 - Pages 4714-4729 - 2021
P. M. Harikrishnan¹, Anju Thomas¹, Varun P. Gopi¹, P. Palanisamy¹, Khan A. Wahid²
¹ Department of Electronics and Communication Engineering, National Institute of Technology Tiruchirappalli, Tamil Nadu, India
² Department of Electrical and Computer Engineering, University of Saskatchewan, Saskatchewan, Canada

Abstract

Detecting and counting multi-class vehicles in video-based traffic surveillance systems with real-time performance and acceptable accuracy is challenging. This paper proposes a modified single shot multi-box convolutional neural network named Inception-SSD (ISSD) for vehicle detection and a centroid matching algorithm for vehicle counting. An Inception-like block replaces the extra feature layers of the original SSD to handle multi-scale vehicle detection and improve the detection of smaller vehicles. Non-Maximum Suppression (NMS) is replaced with Affinity Propagation Clustering (APC) to improve the detection of nearby occluded vehicles. For 300 × 300 input images on the PASCAL VOC 2007 test set, the proposed ISSD achieves a mean Average Precision (mAP) of 79.3 and runs at 52.3 frames per second on an NVIDIA RTX2080Ti. ISSD with APC improves mAP by 2.7% over the original SSD300 while maintaining time efficiency. Using the centroid matching algorithm, vehicles are counted class-wise with a weighted F1 score of 98.5%, which exceeds the results reported in other recent works.
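
The abstract does not spell out the centroid matching step, so the following is only a minimal sketch of one common way to do class-wise counting from per-frame detections. It assumes greedy nearest-centroid matching between consecutive frames, a hypothetical horizontal counting line `line_y`, and a hypothetical distance threshold `max_dist`; none of these names or choices come from the paper, and the authors' algorithm may differ.

```python
from collections import Counter
from math import hypot


def centroid(box):
    """Return the (x, y) centre of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def count_vehicles(frames, line_y, max_dist=50.0):
    """Count vehicles per class as their centroids cross a horizontal line.

    frames:   list of per-frame detections, each a list of (label, box).
    line_y:   y-coordinate of the counting line (an assumption of this sketch).
    max_dist: largest centroid displacement in pixels accepted as a match.
    """
    counts = Counter()
    tracks = []  # tracks from the previous frame: (label, centroid, counted)
    for detections in frames:
        new_tracks = []
        used = set()  # indices of previous tracks already matched in this frame
        for label, box in detections:
            c = centroid(box)
            # Greedily pick the nearest unmatched previous centroid of the same class.
            best, best_d = None, max_dist
            for i, (t_label, t_c, _) in enumerate(tracks):
                if i in used or t_label != label:
                    continue
                d = hypot(c[0] - t_c[0], c[1] - t_c[1])
                if d <= best_d:
                    best, best_d = i, d
            counted = False
            if best is not None:
                used.add(best)
                _, prev_c, counted = tracks[best]
                # Count once, when the centroid moves from above the line to on or below it.
                if not counted and prev_c[1] < line_y <= c[1]:
                    counts[label] += 1
                    counted = True
            new_tracks.append((label, c, counted))
        # Unmatched previous tracks are dropped; a fuller tracker would keep them briefly.
        tracks = new_tracks
    return counts


if __name__ == "__main__":
    # Toy detections: one car and one bus moving down the image over three frames.
    frames = [
        [("car", (0, 0, 10, 10)), ("bus", (100, 0, 120, 10))],
        [("car", (0, 20, 10, 30)), ("bus", (100, 4, 120, 14))],
        [("car", (0, 40, 10, 50)), ("bus", (100, 30, 120, 40))],
    ]
    print(dict(count_vehicles(frames, line_y=20)))
```

Matching only within the same class keeps the counts separated per vehicle type, and the `counted` flag prevents a vehicle from being counted again after it has crossed the line. The greedy matching used here is a simplification; an optimal assignment would be a natural refinement when many vehicles are close together.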

Keywords

#vehicle detection #vehicle counting #convolutional neural network #affinity propagation clustering #deep learning #SSD #real-time performance
