A hybrid approach for the detection and monitoring of people having personality disorders on social networks

Social Network Analysis and Mining - Volume 12 - Pages 1-17 - 2022
Mourad Ellouze1, Lamia Hadrich Belguith1
1ANLP Research Group, MIRACL Laboratory, FSEGS, University of Sfax, Sfax, Tunisia

Abstract

Research in the medical field never stops evolving. This evolution obliges doctors to stay up to date in order to manage every situation that may arise with their patients. However, the medical field is highly sensitive and demands great precision, which poses a major problem. Consequently, computer science is called upon to address these issues. In this context, we propose an architecture that takes advantage of artificial intelligence (AI) and text mining techniques to: (i) identify individuals with a personality disorder from their textual production on social networks by classifying their sets of tweets into distinct classes representing, respectively, the presence, the category and the type of the disease, and (ii) provide personalized monitoring by filtering inappropriate tweets according to each patient's circumstances. The first phase relies on a deep neural approach that uses: (i) CNN layers for feature extraction from the text, (ii) two LSTM layers to preserve long-term dependencies between lexical units, and (iii) an SVM classifier to detect the affected person using the dependency links produced by the previous layer. The second phase applies a hybrid approach that combines linguistic and statistical techniques in order to filter inappropriate tweets according to the state of each patient. In our evaluation, the approach obtains an F-measure of 84% for the detection of personality disorder, 64% for the detection of the type of disease and 70% for the task of filtering inappropriate content. These results are encouraging and may motivate researchers to improve on them, given the interest and importance of this line of research.
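To make the first-phase data flow concrete, the following is a minimal sketch of one possible reading of the CNN, two-LSTM and SVM stack described in the abstract. It only traces a forward pass with random, untrained weights. The dimensions, the helper names (`conv1d_relu`, `LSTMLayer`, `svm_decision`, `extract_features`, `classify`), the omission of bias terms, and the use of a linear SVM on the final LSTM state are all assumptions made for illustration; the abstract does not specify the embeddings, hyperparameters, or training procedure.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained toy weights are reproducible

def rand_matrix(rows, cols):
    """Random placeholder weights; a real model would learn these."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv1d_relu(seq, filters, width):
    """Apply each filter to every window of `width` token vectors, then ReLU."""
    out = []
    for t in range(len(seq) - width + 1):
        window = [x for vec in seq[t:t + width] for x in vec]  # flatten the window
        out.append([max(0.0, v) for v in matvec(filters, window)])
    return out

class LSTMLayer:
    """Plain LSTM forward pass (bias terms omitted for brevity)."""

    def __init__(self, in_dim, hidden):
        self.hidden = hidden
        # One weight matrix per gate, applied to [input ; previous hidden state].
        self.w = {gate: rand_matrix(hidden, in_dim + hidden) for gate in "ifoc"}

    def run(self, seq):
        h = [0.0] * self.hidden
        c = [0.0] * self.hidden
        outputs = []
        for x in seq:
            z = x + h  # list concatenation
            i = [sigmoid(v) for v in matvec(self.w["i"], z)]    # input gate
            f = [sigmoid(v) for v in matvec(self.w["f"], z)]    # forget gate
            o = [sigmoid(v) for v in matvec(self.w["o"], z)]    # output gate
            g = [math.tanh(v) for v in matvec(self.w["c"], z)]  # candidate state
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
            outputs.append(h)
        return outputs

def svm_decision(features, weights, bias):
    """Linear SVM decision rule: the sign of w.x + b."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score >= 0 else -1

# Hypothetical sizes chosen only to keep the example small.
EMB, FILTERS, WIDTH, HIDDEN = 8, 6, 3, 4
conv_filters = rand_matrix(FILTERS, EMB * WIDTH)
lstm1 = LSTMLayer(FILTERS, HIDDEN)
lstm2 = LSTMLayer(HIDDEN, HIDDEN)
svm_w = [random.uniform(-1, 1) for _ in range(HIDDEN)]
svm_b = 0.0

def extract_features(token_vectors):
    """CNN features -> two stacked LSTM layers -> final hidden state."""
    conv = conv1d_relu(token_vectors, conv_filters, WIDTH)
    return lstm2.run(lstm1.run(conv))[-1]

def classify(token_vectors):
    """+1 / -1 stand in for 'disorder present' / 'absent' in this sketch."""
    return svm_decision(extract_features(token_vectors), svm_w, svm_b)

# A fake 10-token tweet with random embeddings, standing in for real input.
tweet = [[random.uniform(-1, 1) for _ in range(EMB)] for _ in range(10)]
features = extract_features(tweet)
label = classify(tweet)
```

In a trained system, the convolution and LSTM weights would be learned first, and the SVM would then be fitted on the extracted feature vectors; the same feature extractor could feed separate classifiers for the presence, category and type of the disorder.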
