Computer morphology for investigations of a variable text

L. Yu. Kovrigina
St. Petersburg National Research University of Information Technologies, Mechanics, and Optics, St. Petersburg, Russia

Abstract

This paper describes the principles and results of developing a freely distributed computer morphology for the morphological analysis of medieval Russian manuscript texts. It also sets out principles for developing a computer morphology for the corpus of a variable text and presents examples of NooJ dictionaries and grammars for processing texts with a large proportion of graphic and grammatical variants.
