Spatio-Temporal Two-stage Fusion for video question answering

Computer Vision and Image Understanding - Volume 237 - Page 103821 - 2023
Feifei Xu1, Yitao Zhu1, Chun Wang1, Yangze Cao1, Zheng Zhong1, Xiongmin Li2
1Shanghai University of Electric Power, No. 1851, Hucheng Ring Road, Pudong New Area, Shanghai 201306, China
2Cognizant Technology Solutions U.S. Corporation, 211 Quality Circle, College Station, TX 77845, United States

References