Learning from masked analogies between sentences at multiple levels of formality

Liyan Wang, Yves Lepage
Graduate School of IPS, Waseda University, Kitakyushu, Japan

Abstract

This paper explores the inference of sentence analogies that are not restricted to the formal level. We introduce MaskPrompt, a prompt-based method that casts the analogy task as masked analogy completion. This enables us to fine-tune pre-trained language models, in a lightweight manner, on the task of reconstructing masked spans in analogy prompts. To construct a corpus of sentence analogies from textual entailment sentence pairs, we apply constraints that approximate the parallelogram view of analogy. In the constructed corpus, sentence analogies are characterized by their level of formality, ranging from strict to loose. We apply MaskPrompt to this corpus and compare it with the basic fine-tuning paradigm. Our experiments show that MaskPrompt outperforms basic fine-tuning at solving analogies, with overall gains of over 2% in accuracy. Furthermore, we study the contribution of loose analogies, i.e., analogies relaxed on the formal aspect. When fine-tuning with only a few hundred of them, the accuracy on strict analogies jumps from 82% to 99%. This demonstrates that loose analogies effectively capture implicit but coherent analogical regularities. We also apply MaskPrompt with different masking schemes to optimize analogy solutions. The best masking scheme during fine-tuning is to mask any term: it exhibits the highest robustness in accuracy across all tested equivalent forms of analogies.
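To make the masked-completion formulation concrete, below is a minimal sketch of filling an analogy prompt "A is to B as C is to <mask>" with a pre-trained denoising model. The prompt template, the example sentences, and the choice of facebook/bart-base are illustrative assumptions, not the paper's exact configuration; without the lightweight fine-tuning described above, the model only demonstrates the mechanism.

from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# An analogy A : B :: C : D with the fourth term masked.
# The template below is a hypothetical choice for illustration.
a = "A man is playing a guitar."
b = "A man is not playing a guitar."
c = "A woman is slicing a tomato."
prompt = f"{a} is to {b} as {c} is to <mask>"

# BART is pre-trained with a text-infilling objective, so <mask> can
# stand in for a multi-token span; fine-tuning on analogy prompts
# would teach the model to reconstruct the missing fourth term D.
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(inputs.input_ids, num_beams=4, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Likewise, a hedged sketch of one possible filter in the spirit of the parallelogram approximation used to mine candidate analogies A : B :: C : D from entailment pairs: the embedding offset from A to B should roughly agree with the offset from C to D. The Sentence-BERT encoder and the 0.7 threshold are assumptions for illustration, not the paper's exact constraints.

import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def offsets_agree(a: str, b: str, c: str, d: str, threshold: float = 0.7) -> bool:
    # Encode the four sentences and compare the two offset vectors
    # B - A and D - C by cosine similarity, following the
    # parallelogram view of analogy.
    ea, eb, ec, ed = encoder.encode([a, b, c, d])
    u, v = eb - ea, ed - ec
    cos = float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    return cos >= threshold

Such an embedding-offset test captures only the semantic side of the parallelogram; strict analogies in the corpus are additionally constrained at the formal level, which this sketch ignores.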
