BERT syntactic transfer: A computational experiment on Italian, French and English languages

Computer Speech & Language, Volume 71, 101261, 2022
Raffaele Guarasci1, Stefano Silvestri1, Giuseppe De Pietro1, Hamido Fujita2, Massimo Esposito1
1Institute for High Performance Computing and Networking of National Research Council of Italy (ICAR-CNR), via Pietro Castellino 111, 80131, Naples, Italy
2Iwate Prefectural University, Takizawa, Iwate, Japan

References

Abeillé, 2020, Extraction from subjects: Differences in acceptability depend on the discourse function of the construction, Cognition, 204, 10.1016/j.cognition.2020.104293
Alexiadou, 2006, On the properties of VSO and VOS orders in Greek and Italian: A study on the syntax information structure interface, 1
Alicante, 2012, A treebank-based study on the influence of Italian word order on parsing performance, 1985
Bates, 1974
Bates, 1982, Functional constraints on sentence processing: A cross-linguistic study, Cognition, 11, 245, 10.1016/0010-0277(82)90017-8
Bauer, 2009, Word order, New Perspect. Hist. Lat. Syntax, 1, 241
Bjerva, 2021, Does typological blinding impede cross-lingual sharing?, 480
Blake, 1988, Basic word order. Functional principles, J. Linguist., 24, 213, 10.1017/S0022226700011646
Bosco, 2013, Converting Italian treebanks: Towards an Italian Stanford dependency treebank, 61
Brunato, D., Dell'Orletta, F., 2017. On the order of words in Italian: a study on genre vs complexity. In: Proceedings of the Fourth International Conference on Dependency Linguistics (Depling 2017), pp. 25–31.
Buchholz, 2006, CoNLL-X shared task on multilingual dependency parsing, 149
Buridant, 2000
Burzio, 1986
Camacho, 2013
Camacho-Collados, 2015, A unified multilingual semantic representation of concepts, 741
Candito, 2014, Deep syntax annotation of the Sequoia French treebank, 2298
Catelli, 2020, Crosslingual named entity recognition for clinical de-identification applied to a COVID-19 Italian data set, Appl. Soft Comput., 97, 10.1016/j.asoc.2020.106779
Chen, C., Ng, V., 2016. Chinese zero pronoun resolution with deep neural networks. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 778–788.
Chi, 2020, Finding universal grammatical relations in multilingual BERT, 5564
Chomsky, 1957
Chomsky, 1981, Lectures on Government and Binding (Dordrecht: Foris), Stud. Gener. Gramm., 9
Chomsky, 1995
Chung, 2010, Factors affecting the accuracy of Korean parsing, 49
Clark, 2019, What does BERT look at? An analysis of BERT's attention, 276
Comrie, 1989
Conneau, 2020, Unsupervised cross-lingual representation learning at scale, 8440
Conneau, 2018, What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties, 2126
Conneau, 2019, Cross-lingual language model pretraining, 7057
Conneau, 2020, Emerging cross-lingual structure in pretrained language models, 6022
Croft, 2009, Methods for finding language universals in syntax, 145
Cruz, 2018, Exploring Spanish corpora for Portuguese coreference resolution, 290
Davis, 2020, Recurrent neural network language models always learn English-like relative clause attachment, 1979
De Santo, 2019, Testing a minimalist grammar parser on Italian relative clause asymmetries, 93
Declerck, 2020, Unified syntax in the bilingual mind, Psychon. Bull. Rev., 27, 149, 10.3758/s13423-019-01666-x
Devlin, 2019, BERT: Pre-training of deep bidirectional transformers for language understanding, 4171
Dhar, 2020
Di Eugenio, 1996
Dorr, 1994, From syntactic encodings to thematic roles: Building lexical entries for interlingual MT, Mach. Transl., 9, 221
Dryer, 2005, Order of degree word and adjective, World Atlas Lang. Struct. [Online], 370
Du, 2020, Commonsense knowledge enhanced memory network for stance classification, IEEE Intell. Syst., 35, 102, 10.1109/MIS.2020.2983497
Eisner, 1996, Three new probabilistic models for dependency parsing: An exploration, 340
Esuli, 2020, Cross-lingual sentiment quantification, IEEE Intell. Syst., 35, 106, 10.1109/MIS.2020.2979203
Ferguson, 1982, Simplified registers and linguistic theory, Except. Lang. Linguist., 49, 66
Ferrández, A., Peral, J., 2000. A computational approach to zero-pronouns in Spanish. In: Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics, pp. 166–172.
Futrell, 2015, Quantifying word order freedom in dependency corpora, 91
Ganin, 2015, Unsupervised domain adaptation by backpropagation, vol. 37, 1180
Gass, 1984, A review of interlanguage syntax: Language transfer and language universals, Lang. Learn., 34, 115, 10.1111/j.1467-1770.1984.tb01007.x
Gilligan, 1989
Godard, 1988
Gopal, 2017, Zero pronouns and their resolution in Sanskrit texts, 255
Gries, 2017, Structural priming within and across languages: A corpus-based perspective, Biling.: Lang. Cogn., 20, 235, 10.1017/S1366728916001085
Grigorova, D., 2013. An algorithm for zero pronoun resolution in Bulgarian. In: Proceedings of the 14th International Conference on Computer Systems and Technologies, pp. 276–283.
Guillaume, 2019, Conversion et améliorations de corpus du Français annotés en Universal Dependencies [Conversion and improvement of French corpora annotated in Universal Dependencies], Traitement Autom. Langues, 60, 71
Hajmohammadi, 2015, Combination of active learning and self-training for cross-lingual sentiment classification with density analysis of unlabelled samples, Inform. Sci., 317, 67, 10.1016/j.ins.2015.04.003
Hartsuiker, 2016, Cross-linguistic structural priming in multilinguals: Further evidence for shared syntax, J. Mem. Lang., 90, 14, 10.1016/j.jml.2016.03.003
Hartsuiker, 2004, Is syntax separate or shared between languages? Cross-linguistic syntactic priming in Spanish–English bilinguals, Psychol. Sci., 15, 409, 10.1111/j.0956-7976.2004.00693.x
Hauer, 2020
Hayashi, 2020, Cluster-based zero-shot learning for multivariate data, J. Ambient Intell. Humaniz. Comput.
Hewitt, 2019, Designing and interpreting probes with control tasks, 2733
Hewitt, 2019, A structural probe for finding syntax in word representations, 4129
Jawahar, 2019, What does BERT learn about the structure of language?, 3651
Karthikeyan, 2020, Cross-lingual ability of multilingual BERT: an empirical study
Kolachina, 2019, Bootstrapping UD treebanks for delexicalized parsing, 15
Kondratyuk, 2019, 75 languages, 1 model: Parsing universal dependencies universally, 2779
Kozhevnikov, 2013, Cross-lingual transfer of semantic role labeling models, 1190
Kübler, 2009
Kuncoro, 2018, LSTMs can learn syntax-sensitive dependencies well, but modeling structure makes them better, 1426
Lahousse, 2012, Word order in French, Spanish and Italian: A grammaticalization account, Folia Linguist., 46, 387, 10.1515/flin.2012.014
Lakretz, 2020, What limits our capacity to process nested long-range dependencies in sentence comprehension?, Entropy, 22, 446, 10.3390/e22040446
Li, 2020, User reviews: Sentiment analysis using lexicon integrated two-channel CNN–LSTM family models, Appl. Soft Comput., 94, 10.1016/j.asoc.2020.106435
Linzen, 2019, What can linguistics and deep learning contribute to each other? Response to Pater, Language, 95, e99, 10.1353/lan.2019.0015
Linzen, 2021, Syntactic structure from deep learning, Annu. Rev. Linguist., 7, 195, 10.1146/annurev-linguistics-032020-051035
Linzen, 2016, Assessing the ability of LSTMs to learn syntax-sensitive dependencies, Trans. Assoc. Comput. Linguist., 4, 521, 10.1162/tacl_a_00115
Liu, 2010, Dependency direction as a means of word-order typology: A method based on dependency treebanks, Lingua, 120, 1567, 10.1016/j.lingua.2009.10.001
Liu, 2012, Quantitative typological analysis of Romance languages, Pozn. Stud. Contemp. Linguist., 48, 597
Liu, 2017, Dependency distance: A new perspective on syntactic patterns in natural languages, Phys. Life Rev., 21, 171, 10.1016/j.plrev.2017.03.002
Loebell, 2003, Structural priming across languages, Linguistics, 41, 791, 10.1515/ling.2003.026
Majid, 2015, Semantic systems in closely related languages, Lang. Sci., 49, 1, 10.1016/j.langsci.2014.11.002
Marchello-Nizia, 2006
Marvin, 2019, Targeted syntactic evaluation of language models, Proc. Soc. Comput. Linguist. (SCiL), 373
McCoy, 2020, Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks, Trans. Assoc. Comput. Linguist., 8, 125, 10.1162/tacl_a_00304
McWhorter, 2001, The world's simplest grammars are creole grammars, Linguist. Typol., 5, 125
Newmeyer, 2008, Universals in syntax, Linguist. Rev., 25, 35
Nivre, 2017, Universal dependency evaluation, 86
Nivre, 2016, Universal dependencies v1: A multilingual treebank collection, 1659
Nivre, J., de Marneffe, M.-C., Ginter, F., Hajic, J., Manning, C.D., Pyysalo, S., Schuster, S., Tyers, F., Zeman, D., 2020. Universal dependencies v2: An evergrowing multilingual treebank collection. In: Proceedings of the 12th Language Resources and Evaluation Conference, pp. 4034–4043.
Pamungkas, 2020, Misogyny detection in Twitter: a multilingual and cross-domain study, Inf. Process. Manage., 57, 10.1016/j.ipm.2020.102360
Peters, 2018, Deep contextualized word representations, 2227
Pires, 2019, How multilingual is multilingual BERT?, 4996
Raganato, 2018, An analysis of encoder representations in transformer-based machine translation, 287
Ranta, 2009, Grammar development in GF, 57
Rasooli, 2017, Cross-lingual syntactic transfer with limited resources, Trans. Assoc. Comput. Linguist., 5, 279, 10.1162/tacl_a_00061
Ravishankar, 2021, Attention can reflect syntactic structure (if you let it), 3031
Rizzi, 1982
Rizzi, 1986, Null objects in Italian and the theory of pro, Linguist. Inq., 17, 501
Rönnqvist, 2019, Is multilingual BERT fluent in language generation?, 29
Rothman, 2009, Understanding the nature and outcomes of early bilingualism: Romance languages as heritage languages, Int. J. Biling., 13, 155, 10.1177/1367006909339814
Seddah, 2013, Overview of the SPMRL 2013 shared task: A cross-framework evaluation of parsing morphologically rich languages, 146
Shin, 2009, Syntactic processing in Korean–English bilingual production: Evidence from cross-linguistic structural priming, Cognition, 112, 175, 10.1016/j.cognition.2009.03.011
Siddhant, 2020, Evaluating the cross-lingual effectiveness of massively multilingual neural machine translation, 8854
Silveira, 2014, A gold standard dependency corpus for English, 2897
Silvestri, 2020, Exploit multilingual language model at scale for ICD-10 clinical text classification, 1
Simi, 2014, Less is more? Towards a reduced inventory of categories for training a parser for the Italian Stanford dependencies, 83
Søgaard, 2018, On the limitations of unsupervised bilingual dictionary induction, 778
Solodow, 2010
Song, 2020, ZPR2: Joint zero pronoun recovery and resolution using multi-task learning and BERT, 5429
Spence Green, C.S., Manning, C.D., 2009. NP subject detection in verb-initial Arabic clauses. In: Proceedings of the Third Workshop on Computational Approaches to Arabic Script-Based Languages (CAASL3), Vol. 112, p. 123.
Sukthanker, 2020, Anaphora and coreference resolution: A review, Inf. Fusion, 59, 139, 10.1016/j.inffus.2020.01.010
Tenney, 2019, BERT rediscovers the classical NLP pipeline, 4593
Tenney, 2019, What do you learn from context? Probing for sentence structure in contextualized word representations
Thierry, 2007, Brain potentials reveal unconscious translation during foreign-language comprehension, Proc. Natl. Acad. Sci., 104, 12530, 10.1073/pnas.0609927104
Thompson, B., Roberts, S., Lupyan, G., 2018. Quantifying semantic alignment across languages. In: Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018), Madison, WI, USA, pp. 2551–2556.
Tsarfaty, R., Nivre, J., Andersson, E., 2012. Cross-framework evaluation for statistical parsing. In: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, pp. 44–54.
Tsarfaty, 2010, Statistical parsing of morphologically rich languages (SPMRL): what, how and whither, 1
Vaswani, 2017, Attention is all you need, 5998
Vennemann, T., 1974. Topics, subjects and word order: from SXV to SVX via TVX. In: Historical Linguistics: Proceedings of the First International Congress of Historical Linguistics, Edinburgh, Scotland, pp. 339–376.
Vulić, 2019, Do we really need fully unsupervised cross-lingual embeddings?, 4407
Wang, 2018, Translating pro-drop languages with reconstruction models, 4937
Wang, 2017, A novel and robust approach for pro-drop language translation, Mach. Transl., 31, 65, 10.1007/s10590-016-9184-9
Warstadt, 2019, Neural network acceptability judgments, Trans. Assoc. Comput. Linguist., 7, 625, 10.1162/tacl_a_00290
Whaley, 1996
Wu, 2020, Perturbed masking: Parameter-free probing for analyzing and interpreting BERT, 4166
Wu, 2019, Beto, Bentz, Becas: The surprising cross-lingual effectiveness of BERT, 833
Zeman, 2017, CoNLL 2017 shared task: Multilingual parsing from raw text to universal dependencies, 1