Arabic vs. English: Comparative Statistical Study

Arabian Journal for Science and Engineering - Volume 39, Issue 2 - Pages 809-820 - 2014

Fahad Alotaiby¹, Salah G. Foda¹, Ibrahim A. Al-Kharashi²

¹Department of Electrical Engineering, College of Engineering, King Saud University, Riyadh, Saudi Arabia

²Computer Research Institute, King Abdulaziz City for Science and Technology, Riyadh, Saudi Arabia


References

Yang, S.; Zhu, H.; Apostoli, A.; Cao, P.: N-gram statistics in English and Chinese: similarities and differences. In: Proceedings of IEEE International Conference on Semantic Computing, Irvine, pp. 454–460 (2007)

Al-Kadi, I.: Study of information-theoretic properties of Arabic based on word entropy and Zipf’s law. J. King Saud Univ. 10, 1–14 (1996)

Attia, M.: Arabic tokenization system. In: Proceedings of the 2007 Workshop on Computational Approaches To Semitic Languages: Common Issues and Resources. Association for Computational Linguistics, Prague, pp. 65–72 (2007)

Heintz, I.: Arabic language modeling with finite state transducers. In: Proceedings of the ACL-08: HLT Student Research Workshop, Companion Volume, Columbus, pp. 37–42 (2008)

Buckwalter, T.: Buckwalter Arabic Morphological Analyzer Version 2.0. Linguistic Data Consortium (LDC) catalogue number LDC2004L02, Philadelphia, USA, ISBN 1-58563-324-0 (2004)

Rashwan, M.; Badrashiny, M.; Attia, M.; Abdou, S.; Rafea, A.: A stochastic Arabic diacritizer based on a hybrid of factorized and unfactorized textual features. IEEE Trans. Audio Speech Lang. Process. (TASLP) 19(1), 166–175 (2011)

Shaalan, K.; Abo Bakr, H.; Ziedan, I.: A hybrid approach for building Arabic diacritizer. In: Proceedings of the EACL 2009 Workshop on Computational Approaches to Semitic Languages, Athens, pp. 27–35 (2009)

Kadri, Y.; Nie, J.Y.: Effective stemming for Arabic information retrieval. In: Proceedings of the Challenge of Arabic for NLP/MT Conference. The British Computer Society, London (2006)

Sawalha, M.; Atwell, E.: Comparative evaluation of Arabic language morphological analysers and stemmers. In: Proceedings of COLING 2008 22nd International Conference on Computational Linguistics, Manchester (2008)

Rogati, M.; McCarley, S.; Yang, Y.: Unsupervised learning of Arabic stemming using a parallel corpus. In: Proceedings of the Second Workshop on Analytics for Noisy Unstructured Text Data, Singapore, pp. 113–118 (2003)

Buckwalter, T.: Issues in Arabic orthography and morphology analysis. In: Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages, Geneva (2004)

Graff, D.: Arabic Gigaword Third Edition. Linguistic Data Consortium, Philadelphia (2007)

Graff, D.; Kong, J.; Chen, K.; Maeda, K.: English Gigaword Third Edition. Linguistic Data Consortium, Philadelphia (2007)

Diab, M.; Hacioglu, K.; Jurafsky, D.: Automatic tagging of Arabic text: from raw text to base phrase chunks. 5th Meeting of the North American Chapter of the Association for Computational Linguistics/Human Language Technologies Conference (HLT-NAACL04), Boston (2004)

Habash, N.; Rambow, O.; Roth, R.: MADA+TOKAN: A toolkit for Arabic tokenization, diacritization, morphological disambiguation, POS tagging, stemming and lemmatization. In: Proceedings of the Second International Conference on Arabic Language Resources and Tools, Cairo, pp. 102–109 (2009)

Diab, M.: Second generation AMIRA tools for Arabic processing: fast and robust tokenization, POS tagging, and phrase chunking. In: Proceedings of the Second International Conference on Arabic Language Resources and Tools, Cairo, pp. 285–288 (2009)

Alghoneim, K.; Alotaiby, F.: Syllable based labeling for continuous Arabic speech recognition. J. Appl. Sci. Comput. 10(2), 77–86 (2003)

Manning, C.; Schütze, H.: Foundations of Statistical Natural Language Processing. The MIT Press, Cambridge (1999)

Maamouri, M.; Bies, A.; Kulick, S.; Gaddeche, F.; Mekki, W.: Arabic Treebank: Part 3(a) v. 2.6. Linguistic Data Consortium, Philadelphia, Catalog ID: LDC2007E65 (2007)
