JUIVCDv1: development of a still-image based dataset for Indian vehicle classification

Sourajit Maity1, Debam Saha1, Pawan Kumar Singh2, Ram Sarkar1
1Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
2Department of Information Technology, Jadavpur University, Kolkata, India

Abstract

An automatic vehicle classification (AVC) system built on still images or videos can bring significant benefits to the development of traffic control systems. Numerous articles on AVC have been published in the literature. Over the years, researchers in this domain have created and used a variety of datasets, but these datasets often do not reflect the scenarios of the Indian subcontinent, because of the peculiarities of its road conditions, the nature of its road congestion, and the vehicle types usually seen there. The primary goal of this paper is to create a new still-image dataset, called JUIVCDv1, which contains 12 local vehicle classes collected with mobile cameras in different ways, for developing an automated vehicle management system. We also discuss the characteristics of existing datasets and the other factors taken into account while creating the dataset for the Indian scenario. In addition, we benchmark results on the developed dataset using eight state-of-the-art pre-trained convolutional neural network (CNN) models, namely Xception, InceptionV3, DenseNet121, MobileNetV2, VGG16, NasNetMobile, ResNet50, and ResNet152. Among these, the Xception, InceptionV3, and DenseNet121 models produce the best classification accuracy scores of 0.94, 0.93, and 0.92, respectively. These three models are then combined into ensemble models to improve overall classification performance. The majority voting-based, weighted average-based, and sum rule-based ensemble approaches give accuracy scores of 0.95, 0.94, and 0.94, respectively.
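The three ensemble rules named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tie-breaking rule, the toy probability vectors, and the choice of using each model's reported accuracy as its weight are all assumptions made here for illustration. Each function takes the per-class probability vectors that several classifiers produce for one image and returns the index of the predicted class.

```python
from collections import Counter


def argmax(scores):
    """Return the index of the largest value in a list of scores."""
    return max(range(len(scores)), key=lambda i: scores[i])


def majority_vote(prob_sets):
    """Each model votes for its top class; the most frequent vote wins.

    Ties go to the earliest model among the tied classes
    (an assumed tie-breaking rule, not taken from the paper).
    """
    votes = [argmax(p) for p in prob_sets]
    counts = Counter(votes)
    best = max(counts.values())
    for v in votes:
        if counts[v] == best:
            return v


def sum_rule(prob_sets):
    """Add the per-class probabilities across models, then take the argmax."""
    n_classes = len(prob_sets[0])
    totals = [sum(p[c] for p in prob_sets) for c in range(n_classes)]
    return argmax(totals)


def weighted_average(prob_sets, weights):
    """Average the per-class probabilities with one weight per model."""
    total_w = sum(weights)
    n_classes = len(prob_sets[0])
    avg = [
        sum(w * p[c] for w, p in zip(weights, prob_sets)) / total_w
        for c in range(n_classes)
    ]
    return argmax(avg)


# Hypothetical softmax outputs of three models for one image (3 classes).
probs = [
    [0.51, 0.49, 0.00],  # model A: weakly prefers class 0
    [0.52, 0.48, 0.00],  # model B: weakly prefers class 0
    [0.05, 0.95, 0.00],  # model C: strongly prefers class 1
]
# Assumption: weights set to the single-model accuracies quoted in the abstract.
weights = [0.94, 0.93, 0.92]

print(majority_vote(probs))              # 0
print(sum_rule(probs))                   # 1
print(weighted_average(probs, weights))  # 1
```

The toy input shows why the rules can disagree: majority voting only counts top-1 votes, so two weak votes outweigh one confident vote, whereas the sum and weighted-average rules take the size of each probability into account.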
