Deep learning in video multi-object tracking: A survey
Abstract
Keywords
References
K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, (2014) arXiv:1409.1556.
Szegedy, 2015, Going deeper with convolutions, 1
He, 2016, Deep residual learning for image recognition, 770
Ren, 2015, Faster R-CNN: Towards real-time object detection with region proposal networks, 91
Liu, 2016, SSD: single shot multibox detector, 21
Redmon, 2017, YOLO9000: better, faster, stronger, 7263
Sak, 2014, Long short-term memory recurrent neural network architectures for large scale acoustic modeling
Sundermeyer, 2012, LSTM neural networks for language modeling
Fan, 2014, TTS synthesis with bidirectional LSTM based recurrent neural networks
Marchi, 2014, Multi-resolution linear prediction based features for audio onset detection with bidirectional LSTM neural networks, 2164
W. Luo, J. Xing, A. Milan, X. Zhang, W. Liu, X. Zhao, T.-K. Kim, Multiple object tracking: a literature review, (2014) arXiv:1409.7618.
Camplani, 2016, Multiple human tracking in RGB-depth data: a survey, IET Comput. Vision, 11, 265, 10.1049/iet-cvi.2016.0178
P. Emami, P.M. Pardalos, L. Elefteriadou, S. Ranka, Machine learning methods for solving assignment problems in multi-target tracking, (2018) arXiv:1802.06897.
L. Leal-Taixé, A. Milan, K. Schindler, D. Cremers, I. Reid, S. Roth, Tracking the trackers: an analysis of the state of the art in multiple object tracking, (2017) arXiv:1704.02781.
L. Leal-Taixé, A. Milan, I. Reid, S. Roth, K. Schindler, MOTChallenge 2015: towards a benchmark for multi-target tracking, (2015) arXiv:1504.01942.
A. Milan, L. Leal-Taixé, I. Reid, S. Roth, K. Schindler, MOT16: a benchmark for multi-object tracking, (2016) arXiv:1603.00831.
He, 2017, Mask R-CNN, 2961
Dai, 2016, R-FCN: Object detection via region-based fully convolutional networks, 379
Wu, 2006, Tracking of multiple, partially occluded humans based on static body part detection, 1, 951
Bernardin, 2008, Evaluating multiple object tracking performance: the CLEAR MOT metrics, J. Image Video Process., 2008, 1, 10.1155/2008/246309
Ristani, 2016, Performance measures and a data set for multi-target, multi-camera tracking, 17
Stiefelhagen, 2007, Multimodal technologies for perception of humans, 4122
Stiefelhagen, 2008, Multimodal technologies for perception of humans, 4625
Dollár, 2014, Fast feature pyramids for object detection, IEEE Trans. Pattern Anal. Mach. Intell., 36, 1532, 10.1109/TPAMI.2014.2300479
Felzenszwalb, 2009, Object detection with discriminatively trained part-based models, IEEE Trans. Pattern Anal. Mach. Intell., 32, 1627, 10.1109/TPAMI.2009.167
R.B. Girshick, P.F. Felzenszwalb, D. McAllester, Discriminatively trained deformable part models, release 5, 2012, (http://people.cs.uchicago.edu/~rbg/latent-release5/).
Yang, 2016, Exploit all the layers: fast and accurate CNN object detector with scale dependent pooling and cascaded rejection classifiers, 2129
P. Dendorfer, H. Rezatofighi, A. Milan, J. Shi, D. Cremers, I. Reid, S. Roth, K. Schindler, L. Leal-Taixé, CVPR19 tracking and detection challenge: how crowded can it get?, 2019.
Geiger, 2012, Are we ready for autonomous driving? The KITTI vision benchmark suite, 3354
Geiger, 2013, Vision meets robotics: the KITTI dataset, Int. J. Robot. Res., 32, 1231, 10.1177/0278364913491297
Wang, 2013, Regionlets for generic object detection, 17
L. Wen, D. Du, Z. Cai, Z. Lei, M.-C. Chang, H. Qi, J. Lim, M.-H. Yang, S. Lyu, UA-DETRAC: a new benchmark and protocol for multi-object detection and tracking, (2015) arXiv:1511.04136.
Andriluka, 2010, Monocular 3D pose estimation and tracking by detection, 623
Ferryman, 2009, PETS2009: dataset and challenge, 1
Bewley, 2016, Simple online and realtime tracking, 3464
Kalman, 1960, A new approach to linear filtering and prediction problems, J. Basic Eng., 82, 35, 10.1115/1.3662552
Kuhn, 1955, The Hungarian method for the assignment problem, Naval Res. Logist. Q., 2, 83, 10.1002/nav.3800020109
Yu, 2016, POI: multiple object tracking with high performance detection and appearance feature, 36
Bell, 2016, Inside-outside net: detecting objects in context with skip pooling and recurrent neural networks, 2874
Gidaris, 2015, Object detection via a multi-region and semantic segmentation-aware CNN model, 1134
Wojke, 2017, Simple online and realtime tracking with a deep association metric, 3645
Mahmoudi, 2019, Multi-target tracking using CNN-based features: CNNMTT, Multimed. Tools Appl., 78, 7077, 10.1007/s11042-018-6467-6
Wan, 2018, An online and flexible multi-object tracking framework using long short-term memory, 1230
Ujiie, 2018, Interpolation-based object detection using motion vectors for embedded real-time tracking systems, 616
Q. He, J. Wu, G. Yu, C. Zhang, SOT for MOT, (2017) arXiv:1712.01059.
Li, 2017, Multi-person tracking by discriminative affinity model and hierarchical association, 1741
W. Li, M.-C. Chang, S. Lyu, Who did what at where and when: simultaneous multi-person tracking and activity recognition, (2018) arXiv:1807.01253.
Jorquera, 2019, Probability hypothesis density filter using determinantal point processes for multi object tracking, Comput. Vis. Image Underst., 183, 33, 10.1016/j.cviu.2019.04.001
Zhong, 2019, Decision controller for object tracking with deep reinforcement learning, IEEE Access, 7, 28069, 10.1109/ACCESS.2019.2900476
Lu, 2019, Multi-target tracking by non-linear motion patterns based on hierarchical network flows, Multimed. Syst., 24, 383, 10.1007/s00530-019-00614-y
Tang, 2017, Multiple people tracking by lifted multicut and person re-identification, 3539
Ran, 2019, A robust multi-athlete tracking algorithm by exploiting discriminant features and long-term dependencies, 411
Hu, 2018, An automatic tracking method for multiple cells based on multi-feature fusion, IEEE Access, 6, 69782, 10.1109/ACCESS.2018.2880563
Zhang, 2019, Automatic individual pig detection and tracking in pig farms, Sensors, 19, 1188, 10.3390/s19051188
Zhou, 2018, Online multi-target tracking with tensor-based high-order graph matching, 1809
Danelljan, 2017, ECO: efficient convolution operators for tracking, 6638
Dalal, 2005, Histograms of oriented gradients for human detection, 1, 886
Van De Weijer, 2009, Learning color names for real-world applications, IEEE Trans. Image Process., 18, 1512, 10.1109/TIP.2009.2019809
Lu, 2017, Online video object detection using association LSTM, 2344
Kieritz, 2018, Joint detection and online multi-object tracking, 1459
Zhao, 2018, Multi-object tracking with correlation filter for autonomous vehicle, Sensors, 18, 2004, 10.3390/s18072004
Pearson, 1901, LIII. On lines and planes of closest fit to systems of points in space, Philos. Mag. J. Sci., 2, 559, 10.1080/14786440109462720
Wang, 2017, Large margin object tracking with circulant feature maps, 4021
Redmon, 2016, You only look once: unified, real-time object detection, 779
J. Redmon, A. Farhadi, YOLOv3: an incremental improvement, (2018) arXiv:1804.02767.
Kim, 2018, Online tracker optimization for multi-pedestrian tracking using a moving vehicle camera, IEEE Access, 6, 48675, 10.1109/ACCESS.2018.2867621
Sharma, 2018, Beyond pixels: Leveraging geometry and shape cues for online multi-object tracking, 3508
Ren, 2017, Accurate single stage detector using recurrent rolling convolution, 5420
Xiang, 2017, Subcategory-aware convolutional neural networks for object proposals and detection, 924
Pernici, 2018, Memory based online learning of deep representations from video streams, 2324
Hu, 2017, Finding tiny faces, 951
Min, 2018, A new approach to track multiple vehicles with the combination of robust detection and two classifiers, IEEE Trans. Intell. Transp. Syst., 19, 174, 10.1109/TITS.2017.2756989
Barnich, 2011, ViBe: a universal background subtraction algorithm for video sequences, IEEE Trans. Image Process., 20, 1709, 10.1109/TIP.2010.2101613
Yu, 2017, A model for fine-grained vehicle classification based on deep learning, Neurocomputing, 257, 97, 10.1016/j.neucom.2016.09.116
Bullinger, 2017, Instance flow based online multiple object tracking, 785
Dai, 2016, Instance-aware semantic segmentation via multi-task network cascades, 3150
Farnebäck, 2003, Two-frame motion estimation based on polynomial expansion, 363
Revaud, 2016, DeepMatching: hierarchical deformable dense matching, Int. J. Comput. Vis., 120, 300, 10.1007/s11263-016-0908-3
Hu, 2016, Efficient coarse-to-fine patchmatch for large displacement optical flow, 5704
Wang, 2014, Learning deep features for multiple object tracking by using a multi-task learning strategy, 838
Cadieu, 2009, Learning transformational invariants from natural movies, 209
Kim, 2015, Multiple hypothesis tracking revisited, 4696
Zheng, 2017, Person re-identification in the wild, 1367
Zheng, 2015, Scalable person re-identification: A benchmark, 1116
Gray, 2008, Viewpoint invariant pedestrian recognition with an ensemble of localized features, 262
Li, 2014, DeepReID: deep filter pairing neural network for person re-identification, 152
Chen, 2017, Enhancing detection model for multiple hypothesis tracking, 18
Yang, 2017, A hybrid data association framework for robust online multi-object tracking, IEEE Trans. Image Process., 26, 5667, 10.1109/TIP.2017.2745103
Girshick, 2015, Region-based convolutional networks for accurate object detection and segmentation, IEEE Trans. Pattern Anal. Mach. Intell., 38, 142, 10.1109/TPAMI.2015.2437384
Wang, 2017, Robust tracking of fish schools using CNN for head identification, Multimed. Tools Appl., 76, 23679, 10.1007/s11042-016-4045-3
S. Zagoruyko, N. Komodakis, Wide residual networks, (2016) arXiv:1605.07146.
Kim, 2018, Multi-object tracking with neural gating using bilinear LSTM, 200
Bae, 2017, Confidence-based data association and discriminative deep appearance learning for robust online multi-object tracking, IEEE Trans. Pattern Anal. Mach. Intell., 40, 595, 10.1109/TPAMI.2017.2691769
Ullah, 2018, Deep feature based end-to-end transportation network for multi-target tracking, 3738
Fang, 2018, Recurrent autoregressive networks for online multi-object tracking, 466
Xiao, 2016, Learning deep feature representations with domain guided dropout for person re-identification, 1249
Fu, 2018, GM-PHD filter based online multiple human tracking using deep discriminative correlation matching, 4299
Vo, 2006, The Gaussian mixture probability hypothesis density filter, IEEE Trans. Signal Process., 54, 4091, 10.1109/TSP.2006.881190
L. Wen, D. Du, S. Li, X. Bian, S. Lyu, Learning non-uniform hypergraph for multi-object tracking, Proceedings of the 33rd AAAI Conference on Artificial Intelligence (2019).
Russakovsky, 2015, ImageNet large scale visual recognition challenge, Int. J. Comput. Vis., 115, 211, 10.1007/s11263-015-0816-y
Korn, 2000, Influence sets based on reverse nearest neighbor queries, ACM Sigmod Record, 29, 201, 10.1145/335191.335415
Sheng, 2018, Heterogeneous association graph fusion for target association in multiple object tracking, IEEE Trans. Circuits Syst. Video Technol., 29, 3269, 10.1109/TCSVT.2018.2882192
Chen, 2019, Recurrent metric networks and batch multiple hypothesis for multi-object tracking, IEEE Access, 7, 3093, 10.1109/ACCESS.2018.2889187
Pishchulin, 2016, DeepCut: joint subset partition and labeling for multi person pose estimation, 4929
Kim, 2016, Similarity mapping with enhanced siamese network for multi-object tracking
Bromley, 1994, Signature verification using a “siamese” time delay neural network, 737
Wang, 2016, Joint learning of convolutional neural networks and temporally constrained metrics for tracklet association, 1
Zhang, 2016, Tracking persons-of-interest via adaptive discriminative features, 415
Leal-Taixé, 2016, Learning by tracking: siamese CNN for robust target association, 33
Leal-Taixé, 2011, Everybody needs somebody: modeling social and grouping behavior on a linear programming multiple people tracker, 120
Son, 2017, Multi-object tracking with quadruplet convolutional neural networks, 5620
Maksai, 2019, Eliminating exposure bias and loss-evaluation mismatch in multiple object tracking, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
A. Hermans, L. Beyer, B. Leibe, In defense of the triplet loss for person re-identification, (2017) arXiv:1703.07737.
Zhu, 2018, Online multi-object tracking with dual matching attention networks, 366
Ma, 2018, Trajectory factory: tracklet cleaving and re-connection by deep siamese Bi-GRU for multiple object tracking, 1
Zhou, 2018, Deep continuous conditional random fields with asymmetric inter-object constraints for online multi-object tracking, IEEE Trans. Circuits Syst. Video Technol., 29, 1011, 10.1109/TCSVT.2018.2825679
Long, 2018, Real-time multiple people tracking with deeply learned candidate selection and person re-identification
Lee, 2019, Multiple object tracking via feature pyramid siamese networks, IEEE Access, 7, 8181, 10.1109/ACCESS.2018.2889442
F.N. Iandola, S. Han, M.W. Moskewicz, K. Ashraf, W.J. Dally, K. Keutzer, SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5 MB model size, (2016) arXiv:1602.07360.
Ullah, 2017, A hierarchical feature model for multi-target tracking, 2612
Mallat, 1993, Matching pursuits with time-frequency dictionaries, IEEE Trans. Signal Process., 41, 3397, 10.1109/78.258082
Sadeghian, 2017, Tracking the untrackable: learning to track multiple cues with long-term dependencies, 300
Chu, 2017, Online multi-object tracking using CNN-based single object tracker with spatial-temporal attention mechanism, 4836
Ozuysal, 2009, Fast keypoint recognition using random ferns, IEEE Trans. Pattern Anal. Mach. Intell., 32, 448, 10.1109/TPAMI.2009.23
Ullah, 2018, A directed sparse graphical model for multi-target tracking, 1816
Rabiner, 1986, An introduction to hidden Markov models, IEEE ASSP Mag., 3, 4, 10.1109/MASSP.1986.1165342
Wang, 2017, Online multiple object tracking via flow and convolutional features, 3630
Ma, 2015, Hierarchical convolutional features for visual tracking, 3074
Lucas, 1981, An iterative image registration technique with an application to stereo vision, 121
Rosello, 2018, Multi-agent reinforcement learning for multi-object tracking, 1397
Babaee, 2018, Occlusion handling in tracking multiple people using RNN, 2715
Milan, 2017, Online multi-target tracking using recurrent neural networks
Xiang, 2015, Learning to track: online multi-object tracking by decision making, 4705
Fang, 2017, RMPE: regional multi-person pose estimation, 2334
Liang, 2018, LSTM multiple object tracker combining multiple cues, 2351
Zhou, 2018, Deep self-paced learning for person re-identification, Pattern Recognit., 76, 739, 10.1016/j.patcog.2017.10.005
Yoon, 2019, Data association for multi-object tracking via deep neural networks, Sensors, 19, 559, 10.3390/s19030559
Robicquet, 2016, Learning social etiquette: human trajectory understanding in crowded scenes, 549
Andriluka, 2008, People-tracking-by-detection and people-detection-by-tracking, 1
K. Cho, B. Van Merriënboer, D. Bahdanau, Y. Bengio, On the properties of neural machine translation: encoder-decoder approaches, (2014) arXiv:1409.1259.
Andres, 2015, Lifting of multicuts, 3
Weinzaepfel, 2013, DeepFlow: large displacement optical flow with deep matching, 1385
Keuper, 2015, Efficient decomposition of image and mesh graphs by lifted multicuts, 1751
Chen, 2017, Online multi-object tracking with convolutional neural networks, 645
Arulampalam, 2002, A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking, IEEE Trans. Signal Process., 50, 174, 10.1109/78.978374
Sanchez-Matilla, 2016, Online multi-target tracking with strong and weak detections, 84
Ma, 2018, Customized multi-person tracker
Tang, 2016, Multi-person tracking by multicut and deep matching, 100
Ren, 2018, Collaborative deep reinforcement learning for multi-object tracking, 586
Nam, 2016, Learning multi-domain convolutional neural networks for visual tracking, 4293
Jiang, 2018, Precise regression for bounding box correction for improved tracking based on deep reinforcement learning, 1643
Lee, 2016, Multi-class multi-object tracking using changing point detection, 68
Hoak, 2017, Image-based multi-target tracking through multi-Bernoulli filtering with interactive likelihoods, Sensors, 17, 501, 10.3390/s17030501
Henschel, 2018, Fusion of head and full-body detectors for multi-object tracking, 1428
Stewart, 2016, End-to-end people detection in crowded scenes, 2325
Gan, 2018, Online CNN-based multiple object tracking with enhanced model updates and identity association, Signal Process.: Image Commun., 66, 95
Xiang, 2019, Online multi-object tracking based on feature representation and Bayesian filtering within a deep learning architecture, IEEE Access
Chu, 2019, Online multi-object tracking with instance-aware tracker and dynamic model refreshment, 161
Watkins, 1989
Takeuchi, 2006, A unifying framework for detecting outliers and change points from time series, IEEE Trans. Knowl. Data Eng., 18, 482, 10.1109/TKDE.2006.1599387
Dollar, 2009, Pedestrian detection: a benchmark, 304
Hoseinnezhad, 2012, Visual tracking of numerous targets via multi-Bernoulli filtering of image data, Pattern Recognit., 45, 3625, 10.1016/j.patcog.2012.04.004
Milan, 2014, Improving global multi-target tracking with local updates, 174
Frank, 1956, An algorithm for quadratic programming, Naval Res. Logist. Q., 3, 95, 10.1002/nav.3800030109
Cao, 2017, Realtime multi-person 2D pose estimation using part affinity fields, 7291
Zhao, 2017, Deeply-learned part-aligned representations for person re-identification, 3219
Henriques, 2014, High-speed tracking with kernelized correlation filters, IEEE Trans. Pattern Anal. Mach. Intell., 37, 583, 10.1109/TPAMI.2014.2345390
Wen, 2014, Multiple target tracking based on undirected hierarchical relation hypergraph, 1282
Qian, 2014, Automatically detect and track multiple fish swimming in shallow water with frequent occlusion, PloS One, 9, e106506, 10.1371/journal.pone.0106506
Pirsiavash, 2011, Globally-optimal greedy algorithms for tracking a variable number of objects, 1201
Gold, 1996, Softmax to softassign: neural network algorithms for combinatorial optimization, J. Artif. Neural Netw., 2, 381
Huang, 2008, Robust object tracking by hierarchical association of detection responses, 788
Benenson, 2012, Pedestrian detection at 100 frames per second, 2903