Application of improved YOLOV5 in plate defect detection

Chenglong Xiong1, Sanbao Hu2, Zhigang Zak Fang2
1Wuhan University of Technology
2School of Automotive Engineering, Wuhan University of Technology, Wuhan, China

Abstract

Keywords

