P2T: Pyramid Pooling Transformer for Scene Understanding

IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 45, No. 11, pp. 12760-12771, 2023
Yu-Huan Wu1,2, Yun Liu3, Xin Zhan1, Ming-Ming Cheng2
1Alibaba DAMO Academy, Hangzhou, China
2TMCC, College of Computer Science, Nankai University, Tianjin, China
3Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore

Abstract

Keywords

