Similarity based person re-identification for multi-object tracking using deep Siamese network

Neural Computing and Applications - Volume 34 - Pages 18171-18182 - 2022
Harun Suljagic1, Ertugrul Bayraktar2, Numan Celebi1
1Department of Information Systems Engineering, Institute of Natural Sciences, Sakarya University, Serdivan, Turkey
2Department of Mechatronics Engineering, Yildiz Technical University, Besiktas, Turkey

Abstract

Object tracking consistently identifies each instance across frames, starting from an initial set of object detections. In multiple object tracking (MOT), the tracking-by-detection paradigm performs two steps consecutively: detection and data association. The goal in MOT is to associate detections across frames by localizing and identifying all objects of interest. MOT algorithms must also keep tracking when challenging issues occur, such as revisiting the same view, missing detections, occlusion, temporarily unseen objects, and same-appearance objects coexisting in the same frame. Re-identification (re-id) therefore appears to be the most powerful tool for assigning the correct identity to each instance when these issues arise. In this work, we propose a similarity-based person re-id framework, called SAT, built on a Siamese neural network with shared weights. Once detections are obtained from the backbone, SAT applies a Siamese feature extraction model, and we then introduce a similarity array for assessing tracklets against detections. We evaluate SAT on several benchmarks with extensive experiments and statistical tests, where it improves on the current state of the art according to commonly used performance metrics, with higher accuracy, fewer ID switches, and lower false positive and false negative rates.
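
The idea of scoring tracklets against detections with a similarity array can be illustrated with a minimal sketch. The abstract does not say how SAT computes or uses the array, so everything specific here is an assumption: cosine similarity as the score, greedy one-to-one matching, the 0.5 threshold, and the hypothetical function names. The hand-written vectors stand in for embeddings that a shared-weight Siamese feature extractor would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as dissimilar to everything
    return dot / (norm_a * norm_b)


def similarity_array(tracklet_embs, detection_embs):
    """Rows index tracklets, columns index detections."""
    return [[cosine_similarity(t, d) for d in detection_embs]
            for t in tracklet_embs]


def greedy_match(sim, threshold=0.5):
    """Pair tracklets with detections in descending order of similarity.

    Assumption: greedy matching with a fixed threshold; this is an
    illustrative choice, not the association rule described in the paper.
    """
    candidates = sorted(
        ((score, ti, di)
         for ti, row in enumerate(sim)
         for di, score in enumerate(row)
         if score >= threshold),
        reverse=True,
    )
    used_tracklets, used_detections, matches = set(), set(), {}
    for score, ti, di in candidates:
        if ti in used_tracklets or di in used_detections:
            continue
        matches[ti] = di
        used_tracklets.add(ti)
        used_detections.add(di)
    return matches


# Toy embeddings (hypothetical): two existing tracklets, three new detections.
tracklets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
detections = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]

sim = similarity_array(tracklets, detections)
matches = greedy_match(sim)
# Detections left unmatched would start new identities in a tracker.
unmatched = [di for di in range(len(detections)) if di not in matches.values()]
print(matches, unmatched)
```

In this toy case each tracklet is paired with the detection whose embedding points in nearly the same direction, and the third detection stays unmatched. An optimal assignment solver could replace the greedy loop without changing the shape of the similarity array.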
