Colexification Networks Encode Affective Meaning

Affective Science - Volume 2, Issue 2 - Pages 99-111 - 2021
Anna Di Natale1, Max Pellert1, David García1
1Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Inffeldgasse 16c/I, Graz, 8010, Austria

Abstract

Colexification is a linguistic phenomenon that occurs when multiple concepts are expressed in a language with the same word. Colexification patterns are frequently used to estimate the meaning similarity between words, but the hypothesis that these are related is still missing direct empirical validation at scale. Here, we show for the first time that words linked by colexification patterns capture similar affective meanings. Using pre-existing translation data, we extend colexification databases to cover much longer word lists. We achieve this with an unsupervised method of affective lexicon extension that uses colexification network data to interpolate the affective ratings of words that are not included in the original lexicon. We find positive correlations between network-based estimates and empirical affective ratings, which suggest that colexification networks contain information related to affective meanings. Finally, we compare our network method with state-of-the-art machine learning, trained on a large corpus, and show that our simple linguistics-informed unsupervised algorithm yields comparable performance with high explainability. These results show that it is possible to automatically expand affective norms lexica to cover exhaustive word lists when additional data are available, such as in colexification networks.
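
The abstract does not specify the interpolation procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a weighted undirected graph in which an edge weight counts how often two words are colexified, and it estimates the rating of each unrated word as the weighted mean of its neighbours' ratings, repeated until the values settle. The function name `interpolate_ratings`, the toy words, the edge weights, and the ratings are all hypothetical.

```python
from collections import defaultdict


def interpolate_ratings(edges, known, iterations=50):
    """Estimate ratings of unrated words from their network neighbours.

    edges: iterable of (word_a, word_b, weight) tuples (assumed format).
    known: dict mapping rated words to their ratings; these stay fixed.
    """
    # Build a symmetric adjacency map: word -> {neighbour: weight}.
    neighbours = defaultdict(dict)
    for a, b, w in edges:
        neighbours[a][b] = w
        neighbours[b][a] = w

    estimates = dict(known)
    for _ in range(iterations):
        updated = dict(estimates)
        for word, nbrs in neighbours.items():
            if word in known:
                continue  # never overwrite an empirical rating
            # Use only neighbours that already carry a rating or an estimate.
            rated = [(estimates[n], w) for n, w in nbrs.items() if n in estimates]
            total = sum(w for _, w in rated)
            if total > 0:
                updated[word] = sum(r * w for r, w in rated) / total
        estimates = updated
    return estimates


# Toy data (hypothetical): two rated words, two unrated words between them.
example_edges = [("good", "fine", 3), ("fine", "okay", 1), ("okay", "bad", 1)]
example_known = {"good": 8.0, "bad": 2.0}
example_estimates = interpolate_ratings(example_edges, example_known)
```

In this toy network, "fine" is strongly linked to "good" and so receives a higher estimate than "okay", which sits between "fine" and "bad". Estimates of this kind can then be correlated with held-out empirical ratings, which is the type of validation the abstract reports.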

