Dual-modality fusion network with external attention for RGBT tracking
Abstract
Owing to the unique complementarity between RGB and thermal (RGBT) images, RGBT tracking has gradually become an important research field. To achieve robust tracking performance, exploiting both local and global information is a key issue for RGBT tracking. Inspired by the external attention mechanism, we design a dual-modality fusion network with external attention (EDFNet) equipped with an external-attention guidance module (EGM). Based on two external memory units, the EGM generates external attention maps that redistribute feature weights according to their correlations. To avoid feature degradation, EDFNet introduces shortcut connections and adaptively realigns and fuses the features from the shortcuts and the external attention with adaptive weights. Furthermore, considering the differences between the RGB and thermal modalities, we design an asymmetric feature enhancement method consisting of detail information guidance (DiG) and structural information enhancement (SiE). DiG refines the detail and texture components of the RGB features through axial detail optimization, while SiE exploits the additive property to strengthen structural features. In addition, we deploy a loss function named partially enhanced weighted loss in EDFNet to fit this new architecture. Evaluation results on the RGBT234 and GTOT benchmarks confirm that EDFNet achieves better tracking performance than competing trackers.
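The external attention idea underlying the EGM can be illustrated with a minimal sketch: instead of computing attention between a feature map and itself, the features are correlated against small learnable external memory units shared across samples, and the resulting attention map redistributes weights before reading out from a second memory. The code below is an illustrative NumPy sketch of this generic mechanism, not EDFNet itself; the memory sizes, the double-normalization step, and the function names are assumptions for demonstration.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def external_attention(feats, mem_k, mem_v):
    """Generic external attention (sketch).

    feats : (N, d) input features
    mem_k : (S, d) external key memory unit
    mem_v : (S, d) external value memory unit
    returns (N, d) re-weighted features
    """
    # Correlate features with the external key memory: (N, S) attention map.
    attn = softmax(feats @ mem_k.T, axis=0)
    # l1-normalize each row so weights over memory slots sum to 1
    # (double normalization, as used in external-attention variants).
    attn = attn / (attn.sum(axis=1, keepdims=True) + 1e-9)
    # Read out from the external value memory.
    return attn @ mem_v

rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8))    # 4 feature vectors of dim 8
mem_k = rng.standard_normal((16, 8))   # 16 memory slots (assumed size)
mem_v = rng.standard_normal((16, 8))
out = external_attention(feats, mem_k, mem_v)
print(out.shape)  # (4, 8)
```

Because the memory units are independent of the input, the cost is linear in the number of input features, which is what makes external attention attractive as a lightweight alternative to self-attention for capturing global context.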
Keywords
#RGBT tracking #dual-modality fusion network #external attention #detail optimization #structural features