Character-level Arabic text generation from sign language video using encoder–decoder model

Displays - Volume 76 - Page 102340 - 2023

Abdelbasset Boukdir¹, Mohamed Benaddy¹, Othmane El Meslouhi², Mustapha Kardouchi³, Moulay Akhloufi³

¹LabSI Laboratory, FSA/PFO, Ibn Zohr University, Ouarzazate, Morocco

²SARS Group, National School of Applied Sciences - Safi, Cadi Ayyad University, Morocco

³PRIME Group, Department of Computer Sciences, Université de Moncton, Moncton, Canada

References

Li, 2019, Visual to text: Survey of image and video captioning, IEEE Trans. Emerg. Top. Comput. Intell., 3, 297, 10.1109/TETCI.2019.2892755

S. Kafle, P. Yeung, M. Huenerfauth, Evaluating the Benefit of Highlighting Key Words in Captions for People who are Deaf or Hard of Hearing, in: The 21st International ACM SIGACCESS Conference on Computers and Accessibility, 2019, pp. 43–55.

Alsmadi, 2020, Content-based image retrieval using color, shape and texture descriptors and features, Arab. J. Sci. Eng., 45, 3317, 10.1007/s13369-020-04384-y

Zhou, 2018, A novel real-time video mosaic block detection based on intensity order and shape feature, 108062M

Islam, 2014, Color feature based video content extraction and its application for poster generation with relevance feedback, 197

Bodini, 2019, A review of facial landmark extraction in 2D images and videos using deep learning, Big Data Cogn. Comput., 3, 14, 10.3390/bdcc3010014

Plyer, 2016, Massively parallel Lucas Kanade optical flow for real-time video processing applications, J. Real-Time Image Process., 11, 713, 10.1007/s11554-014-0423-0

D. Tran, L. Bourdev, R. Fergus, L. Torresani, M. Paluri, Learning spatiotemporal features with 3D convolutional networks, in: Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 4489–4497.

Hori, 2017, Early and late integration of audio features for automatic video description, 430

K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.

C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the inception architecture for computer vision, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818–2826.

Simonyan, 2013

Boukdir, 2022, Isolated video-based Arabic sign language recognition using convolutional and recursive neural networks, Arab. J. Sci. Eng., 47, 2187, 10.1007/s13369-021-06167-5

Wu, 2017, Deep learning for video classification and captioning, 3

Pan, 2021, Chinese image caption of Inceptionv4 and double-layer GRUs based on attention mechanism, 1861,1

Zhao, 2021, A lightweight convolutional neural network for large-scale Chinese image caption, Optoelectron. Lett., 17, 361, 10.1007/s11801-021-0100-z

Liu, 2020, Chinese image caption generation via visual attention and topic modeling, IEEE Trans. Cybern.

Mishra, 2021, A Hindi image caption generation framework using deep learning, Trans. Asian Low-Resour. Lang. Inf. Process., 20, 1, 10.1145/3432246

Singh, 2021, An encoder-decoder based framework for Hindi image caption generation, Multimedia Tools Appl., 80, 35721, 10.1007/s11042-021-11106-5

Mahadi, 2020, Adaptive attention generation for Indonesian image captioning, 1

Biswas, 2021, Improving German image captions using machine translation and transfer learning, 3

Daskalakis, 2018, Learning deep spatiotemporal features for video captioning, Pattern Recognit. Lett., 116, 143, 10.1016/j.patrec.2018.09.022

Yang, 2018, Video captioning by adversarial LSTM, IEEE Trans. Image Process., 27, 5600, 10.1109/TIP.2018.2855422

Xu, 2018, Dual-stream recurrent neural network for video captioning, IEEE Trans. Circuits Syst. Video Technol., 29, 2482, 10.1109/TCSVT.2018.2867286

Jin, 2019, Recurrent convolutional video captioning with global and local attention, Neurocomputing, 370, 118, 10.1016/j.neucom.2019.08.042

Pawade, 2019, Text caption generation based on lip movement of speaker in video using neural network, 313

Liu, 2020, Sibnet: Sibling convolutional encoder for video captioning, IEEE Trans. Pattern Anal. Mach. Intell.

D. Guo, S. Tang, M. Wang, Connectionist Temporal Modeling of Video and Language: a Joint Model for Translation and Sign Labeling, in: IJCAI, 2019, pp. 751–757.

Guo, 2019, Hierarchical recurrent deep fusion using adaptive clip summarization for sign language translation, IEEE Trans. Image Process., 29, 1575, 10.1109/TIP.2019.2941267

Tang, 2021, Graph-based multimodal sequential embedding for sign language translation, IEEE Trans. Multimed.

Wang, 2020, Sequence in sequence for video captioning, Pattern Recognit. Lett., 130, 327, 10.1016/j.patrec.2018.07.024

Vinodhini, 2020, A deep structured model for video captioning, Int. J. Gaming Comput.-Mediat. Simul. (IJGCMS), 12, 44, 10.4018/IJGCMS.2020040103

Nabati, 2020, Video captioning using boosted and parallel long short-term memory networks, Comput. Vis. Image Underst., 190, 10.1016/j.cviu.2019.102840

Hastie, 2009, Multi-class AdaBoost, Stat. Interface, 2, 349, 10.4310/SII.2009.v2.n3.a8

Nabati, 2020, Multi-sentence video captioning using content-oriented beam searching and multi-stage refining algorithm, Inf. Process. Manage., 57, 10.1016/j.ipm.2020.102302

K. Papineni, S. Roukos, T. Ward, W.-J. Zhu, Bleu: a method for automatic evaluation of machine translation, in: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002, pp. 311–318.
