An accurate method for generating image captions for blind people using an extended convolutional atom neural network
Abstract
Keywords
#automatic image captioning #blind people #AI model #computer vision #natural language processing #deep learning #feature extraction #extended convolutional neural network