Inception single shot multi-box detector with affinity propagation clustering and their application in multi-class vehicle counting
Abstract
Detecting and counting multi-class vehicles in video-based traffic surveillance systems with real-time performance and acceptable accuracy is challenging. This paper proposes a modified single shot multi-box convolutional neural network named Inception-SSD (ISSD) for vehicle detection and a centroid matching algorithm for vehicle counting. An Inception-like block replaces the extra feature layers of the original SSD to handle multi-scale vehicle detection and improve the detection of smaller vehicles. Non-Maximum Suppression (NMS) is replaced with Affinity Propagation Clustering (APC) to improve the detection of closely spaced, occluded vehicles. For a 300 × 300 input image, the proposed ISSD achieves a mean Average Precision (mAP) of 79.3 on the PASCAL VOC 2007 test set; running on an NVIDIA RTX2080Ti, the network reaches 52.3 frames per second. ISSD with APC yields a 2.7% improvement in mAP over the original SSD300 while retaining its time efficiency. Using the centroid matching algorithm, vehicles are counted class-wise with a weighted F1 score of 98.5%, considerably better than other recent research works.
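The abstract does not spell out the centroid matching step, so the following is only a minimal sketch of one way class-wise counting by centroid matching could work, not the authors' implementation. It assumes detections arrive per frame as (label, box) pairs, matches each detection to the nearest same-class centroid from the previous frame, and counts a vehicle when its centroid crosses a horizontal line. The names `count_by_class`, `line_y`, and `max_dist`, the greedy nearest-neighbour matching, and the counting line itself are illustrative assumptions.

```python
from math import hypot


def centroid(box):
    """Return the (x, y) centre of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def count_by_class(frames, line_y, max_dist=50.0):
    """Count vehicles per class as their centroids cross a horizontal line.

    frames   -- list of per-frame detection lists; each detection is (label, box)
    line_y   -- y coordinate of the (assumed) counting line, in pixels
    max_dist -- largest centroid displacement still treated as the same vehicle
    """
    counts = {}
    tracks = []  # (label, centroid) pairs from the previous frame only
    for detections in frames:
        unmatched = list(tracks)
        new_tracks = []
        for label, box in detections:
            c = centroid(box)
            # Greedy match: nearest previous centroid of the same class.
            best, best_d = None, max_dist
            for track in unmatched:
                if track[0] != label:
                    continue
                d = hypot(c[0] - track[1][0], c[1] - track[1][1])
                if d <= best_d:
                    best, best_d = track, d
            if best is not None:
                unmatched.remove(best)
                # Count once, when the centroid moves from above the line
                # to on or below it (image y grows downwards).
                if best[1][1] < line_y <= c[1]:
                    counts[label] = counts.get(label, 0) + 1
            new_tracks.append((label, c))
        # Simplification: a vehicle missed for one frame starts a new track.
        tracks = new_tracks
    return counts


# Toy input: a car moving down past y = 20 and a bus that stays above it.
demo_frames = [
    [("car", (0, 0, 10, 10)), ("bus", (100, 0, 120, 20))],
    [("car", (0, 10, 10, 20)), ("bus", (100, 5, 120, 25))],
    [("car", (0, 20, 10, 30)), ("bus", (100, 8, 120, 28))],
]
print(count_by_class(demo_frames, line_y=20, max_dist=15))  # {'car': 1}
```

In this sketch only the car is counted, because the bus centroid never reaches the line. A real system would also need to handle missed detections, traffic in both directions, and class labels that flicker between frames, none of which is covered here.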
Keywords
#vehicle detection #vehicle counting #convolutional neural network #affinity propagation clustering #deep learning #SSD #real-time performance