A hybrid statistical and deep learning based technique for Persian part of speech tagging
Abstract
In part-of-speech (POS) tagging, the main challenge is to predict the correct tags for both in-vocabulary (IV) and out-of-vocabulary (OOV) words. Artificial neural networks such as the multi-layer perceptron (MLP) and long short-term memory (LSTM), which appear well suited to this challenge because of their strong generalization capability, have therefore been applied to POS tagging. In this research, using word vectors as the input to MLP and LSTM neural networks, we perform POS tagging for the Persian language and compare the results of the neural models with a second-order hidden Markov model (HMM), which serves as our benchmark. To investigate the effect of the number of hidden layers, we use both single-layer and two-layer MLP and LSTM networks. We also apply a bidirectional LSTM network to investigate the effect of a bidirectional learning algorithm on Persian POS tagging. The results obtained from the different models in this research show that the neural models perform far better at predicting the correct POS tags for OOV words, which can be attributed to their stronger generalization. We therefore propose a hybrid model, a combination of the HMM and a single-layer bidirectional LSTM, as a novel approach to POS tagging. This hybrid model improves on both the HMM and the neural models, increasing the accuracy to 97.29%.
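To make the benchmark concrete, the second-order (trigram) HMM with Viterbi decoding can be sketched as below. This is a minimal, unsmoothed maximum-likelihood illustration on a toy English corpus, not the authors' Persian implementation; the function names (`train_hmm`, `viterbi`) and the toy data are illustrative assumptions.

```python
from collections import defaultdict
import math

def train_hmm(tagged_sents):
    """MLE estimates of trigram transition and emission probabilities (no smoothing)."""
    trans = defaultdict(lambda: defaultdict(int))  # (tag[i-2], tag[i-1]) -> tag[i]
    emit = defaultdict(lambda: defaultdict(int))   # tag -> word
    for sent in tagged_sents:
        tags = ["<s>", "<s>"] + [t for _, t in sent] + ["</s>"]
        for word, tag in sent:
            emit[tag][word] += 1
        for i in range(2, len(tags)):
            trans[(tags[i - 2], tags[i - 1])][tags[i]] += 1
    def normalize(counts):
        return {k: {x: c / sum(v.values()) for x, c in v.items()} for k, v in counts.items()}
    return normalize(trans), normalize(emit)

def viterbi(words, trans, emit, tagset):
    """Second-order Viterbi decoding over (previous tag, current tag) pair states."""
    # Each state carries (log probability, best tag sequence so far).
    pi = {("<s>", "<s>"): (0.0, [])}
    for w in words:
        nxt = {}
        for (u, v), (score, path) in pi.items():
            for t in tagset:
                p_t = trans.get((u, v), {}).get(t, 0.0)
                p_w = emit.get(t, {}).get(w, 0.0)
                if p_t == 0.0 or p_w == 0.0:
                    continue  # zero-probability path (no smoothing in this sketch)
                s = score + math.log(p_t) + math.log(p_w)
                if (v, t) not in nxt or s > nxt[(v, t)][0]:
                    nxt[(v, t)] = (s, path + [t])
        pi = nxt
    # Finish by scoring the transition into the end-of-sentence marker.
    best = None
    for (u, v), (score, path) in pi.items():
        p_end = trans.get((u, v), {}).get("</s>", 0.0)
        if p_end > 0.0:
            s = score + math.log(p_end)
            if best is None or s > best[0]:
                best = (s, path)
    return best[1] if best else []

# Toy corpus standing in for a real tagged corpus such as Bijankhan.
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

if __name__ == "__main__":
    trans, emit = train_hmm(TOY_CORPUS)
    print(viterbi(["the", "dog", "sleeps"], trans, emit, {"DET", "NOUN", "VERB"}))
    # → ['DET', 'NOUN', 'VERB']
```

The hybrid model described above would then defer to such an HMM for IV words while relying on the bidirectional LSTM's predictions where the HMM's lexical statistics are unavailable (OOV words); in practice the HMM also needs smoothing and suffix-based handling of unseen words, which this sketch omits.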