Privacy concerns in social media UGC communities: Understanding user behavior sentiments in complex networks
Springer Science and Business Media LLC - Pages 1-21 - 2023
Abstract
In a digital ecosystem where large amounts of data on user actions are generated every day, important concerns have emerged about how these data are collected, managed, and analyzed and, accordingly, about user privacy. In recent years, users have become accustomed to organizing in, and relying on, digital communities to support and achieve their goals. In this context, the present study aims to identify the main privacy concerns in user communities on social media and how these concerns affect users’ online behavior. To better understand online communities in social networks, privacy concerns, and their connection to user behavior, we developed an original methodology that combines elements of machine learning as a technical contribution. First, the complex network visualization algorithm ForceAtlas2 was run in the open-source software Gephi to visually identify the nodes that form the main communities in the sample of user-generated content (UGC) collected from Twitter. Then, sentiment analysis was applied with TextBlob, a tool that works with machine learning, on which experiments were run with support vector classifier (SVC), multinomial naïve Bayes (MNB), logistic regression (LR), and random forest classifier (RFC) models under the theoretical frameworks of computer-aided text analysis (CATA) and natural language processing (NLP). As a result, a total of 11 user communities were identified: three positive (protection software, cybersecurity, and eCommerce), three negative (privacy settings, personal information, and social engineering), and five neutral (privacy concerns, hacking, false information, impersonation, and cookies data). The paper concludes with a discussion of the results and their relation to user behavior in digital environments, and outlines valuable, practical insights into techniques and challenges related to users’ personal data.
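To illustrate how a community could end up labelled positive, negative, or neutral, the following is a minimal sketch rather than the authors' procedure. It assumes each tweet already has a polarity score in [-1, 1] (the range TextBlob reports through `TextBlob(text).sentiment.polarity`) and a community assignment from the network step. The neutral band of 0.05, the majority-vote rule, the function names, and the sample scores are all hypothetical; the abstract does not specify any of them.

```python
from collections import Counter, defaultdict

# Hypothetical cut-off: polarity within +/- NEUTRAL_BAND counts as neutral.
NEUTRAL_BAND = 0.05


def label_polarity(polarity, band=NEUTRAL_BAND):
    """Map a polarity score in [-1, 1] to a sentiment label."""
    if polarity > band:
        return "positive"
    if polarity < -band:
        return "negative"
    return "neutral"


def community_sentiment(scored_tweets, band=NEUTRAL_BAND):
    """Return the most common sentiment label for each community.

    scored_tweets: iterable of (community, polarity) pairs.
    """
    counts = defaultdict(Counter)
    for community, polarity in scored_tweets:
        counts[community][label_polarity(polarity, band)] += 1
    return {name: tally.most_common(1)[0][0] for name, tally in counts.items()}


# Made-up scores for illustration only; these are not data from the study.
sample = [
    ("cybersecurity", 0.6), ("cybersecurity", 0.3), ("cybersecurity", -0.1),
    ("social engineering", -0.5), ("social engineering", -0.2),
    ("cookies data", 0.0), ("cookies data", 0.02), ("cookies data", 0.4),
]
result = community_sentiment(sample)
print(result)
```

A majority vote is only one possible aggregation; averaging polarity per community would be an equally plausible choice and could give different labels for mixed communities.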