Misogynoir: challenges in detecting intersectional hate
Abstract
“Misogynoir” is a term that refers to the anti-Black forms of misogyny that Black women experience. To explore how well current automated hate speech detection approaches detect this type of hate, we evaluated two state-of-the-art detection tools, HateSonar and Google’s Perspective API, on two datasets: a balanced dataset of 300 tweets, half of which are examples of misogynoir and half examples of support for Black women, and an imbalanced dataset of 3138 tweets, of which 162 are examples of misogynoir and 2976 are examples of allyship. We aim to determine whether these tools flag such messages under any of their categories of hateful speech (e.g. “hate speech”, “offensive language”, “toxicity”). Close analysis of the classifications and errors shows that current hate speech detection tools are ineffective at detecting misogynoir. They lack sensitivity to context, which is essential for misogynoir detection. We found that the tweets most likely to be classified as hate speech explicitly reference racism or sexism or use profane or aggressive words; subtler tweets without such references are more challenging to classify. We conclude that this lack of sensitivity to context may make such tools not only ineffective but potentially harmful to Black women.
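To make the evaluation setup concrete, the sketch below shows how a single tweet might be scored by both tools. It is a minimal illustration, not the paper’s actual pipeline: it assumes the open-source hatesonar Python package for HateSonar and the Perspective API’s commentanalyzer REST endpoint, and the API-key placeholder and score helpers are hypothetical names introduced here for illustration.

```python
# Minimal sketch (assumed setup, not the paper's pipeline):
# score one tweet with HateSonar and with Perspective's TOXICITY attribute.
# Assumes `pip install hatesonar requests` and a Perspective API key.
import requests
from hatesonar import Sonar

PERSPECTIVE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"
API_KEY = "YOUR_PERSPECTIVE_API_KEY"  # hypothetical placeholder


def hatesonar_scores(text: str) -> dict:
    """Return HateSonar's confidence per class (hate_speech, offensive_language, neither)."""
    result = Sonar().ping(text=text)
    return {c["class_name"]: c["confidence"] for c in result["classes"]}


def perspective_toxicity(text: str) -> float:
    """Return Perspective's TOXICITY summary score (0-1) for the text."""
    body = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
        "doNotStore": True,
    }
    resp = requests.post(PERSPECTIVE_URL, params={"key": API_KEY}, json=body)
    resp.raise_for_status()
    return resp.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


if __name__ == "__main__":
    tweet = "Example tweet text goes here."
    print("HateSonar:", hatesonar_scores(tweet))
    print("Perspective TOXICITY:", perspective_toxicity(tweet))
```

An evaluation along the lines described above would loop such scoring over the balanced and imbalanced datasets and compare the tools’ outputs against the manual misogynoir/allyship labels.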