Automatic Information Extraction From Student ID Card Images Using DB and VietOCR: A Case Study at a Vietnamese University

Online First: 29/04/2026

Journal of Technical Education Science, 2026

Bao-Khanh Hoang¹, Van-Hai Ngo¹, Xuan-Truong Tran¹, Dung Nguyen¹, Duc-Phuc Nguyen²

¹University of Sciences, Hue University, Vietnam

²University of Arts, Hue University, Vietnam

Abstract

The development of an information extraction system from student ID card images plays an important role in the digitalization of student management. This study proposes a two-stage processing framework that integrates computer vision and deep learning techniques, in which MobileNetV3-Small is employed for student identification card image classification, while the Differentiable Binarization (DB) model and VietOCR are responsible for Vietnamese text detection and recognition, respectively. Experimental results on a student ID card image dataset show that the classification model achieves an accuracy of 99.40% with an AUC of 0.9996, while the DB-based text detection model attains an Hmean of 89.81% after data augmentation. For text recognition, the proposed system achieves over 99% character-level accuracy and up to 98.90% full-sequence accuracy. These results demonstrate the effectiveness and practical feasibility of the proposed system, which is further validated through a proof-of-concept offline attendance application. In addition, the system is designed with computational efficiency in mind, enabling deployment on resource-constrained devices without requiring continuous internet connectivity. The proposed framework can be readily adapted to other types of identification documents, providing a scalable and cost-effective solution for automated data acquisition in educational institutions.
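The three evaluation measures quoted in the abstract (Hmean for detection, character-level accuracy and full-sequence accuracy for recognition) can be illustrated with a minimal sketch. The function names are hypothetical, and the position-wise definition of character accuracy is an assumption made for illustration; the abstract does not state how its figures were computed. Unicode NFC normalization is included because Vietnamese diacritics can be encoded in more than one way.

```python
import unicodedata


def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of detection precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def _nfc(text: str) -> str:
    # Vietnamese diacritics may appear as precomposed or combining
    # characters; normalizing keeps those from counting as mismatches.
    return unicodedata.normalize("NFC", text)


def _check(predictions, references):
    if len(predictions) != len(references) or not references:
        raise ValueError("need two non-empty lists of equal length")


def full_sequence_accuracy(predictions, references) -> float:
    """Fraction of text lines predicted exactly right."""
    _check(predictions, references)
    exact = sum(_nfc(p) == _nfc(r) for p, r in zip(predictions, references))
    return exact / len(references)


def char_accuracy(predictions, references) -> float:
    """Mean per-line share of positions where the characters agree.

    Assumed definition: position-wise comparison, divided by the longer
    of the two strings so that extra or missing characters are penalized.
    """
    _check(predictions, references)
    scores = []
    for pred, ref in zip(predictions, references):
        pred, ref = _nfc(pred), _nfc(ref)
        longest = max(len(pred), len(ref))
        if longest == 0:
            scores.append(1.0)  # both strings empty: treat as correct
            continue
        matches = sum(p == r for p, r in zip(pred, ref))
        scores.append(matches / longest)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up example strings, not data from the study.
    preds = ["Nguyễn Văn A", "21T1021"]
    refs = ["Nguyễn Văn A", "21T1020"]
    print(hmean(0.9, 0.9))
    print(full_sequence_accuracy(preds, refs))
    print(char_accuracy(preds, refs))
```

Full-sequence accuracy is the stricter of the two recognition measures: a single wrong character makes the whole line count as an error. This is consistent with the abstract reporting a full-sequence figure below the character-level one.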

Keywords

MobileNetV3; Differentiable Binarization; Text Detection; Vietnamese Text Recognition; VietOCR

