Stemmer and phonotactic rules to improve n-gram tagger-based Indonesian phonemicization

Suyanto Suyanto1, Andi Sunyoto2, Rezza Nafi Ismail1, Ema Rachmawati1, Warih Maharani1
1School of Computing, Telkom University, Bandung, Indonesia
2Faculty of Computer Science, Universitas Amikom Yogyakarta, Indonesia
