Cyber-security: Identity deception detection on social media platforms

Computers & Security - Tập 78 - Trang 76-89 - 2018
Estee van der Walt1, J.H.P. Eloff1, Jacomine Grobler2
1Department of Computer Science, Information Technology Building - Level 4, University of Pretoria, Lynnwood Road, Pretoria, South Africa
2Department of Industrial and Systems Engineering, Engineering building 2 Level 3, University of Pretoria, Lynnwood Road Pretoria, South Africa

Tài liệu tham khảo

Al-Garadi, 2016, Cybercrime detection in online communications: the experimental case of cyberbullying detection in the Twitter network, Comput Hum Behav, 63, 433, 10.1016/j.chb.2016.05.051 Alowibdi, 2015, Deception detection in Twitter, Soc Netw Anal Min, 5, 1, 10.1007/s13278-015-0273-1 Anguita, 2012, The ‘K'in K-fold cross validation, 441 2017, Ceding powers of decision to AI presents a paradox, Financial Times Armstrong Assunção, 2015, Big data computing and clouds: trends and future directions, J Parallel Distrib Comput, 79, 3, 10.1016/j.jpdc.2014.08.003 Baehrens, 2010, How to explain individual classification decisions, J Mach Learn Res, 11, 1803 Beillevaire, M. 2017. Inside the Black Box: How to Explain Individual Predictions of a Machine Learning Model. Computer Science and Engineering Masters, KTH Royal Institute of Technology. Bellinger, 2012, One-class versus binary classification: which and when?, 102 Benevenuto, 2010, Detecting spammers on twitter, 12 Biau, 2012, Analysis of a random forests model, J Mach Learn Res, 13, 1063 Bliss, 2017, The law, social media and the victimisation of women Bogdanova, 2014, Exploring high-level features for detecting cyberpedophilia, Comput Speech Lang, 28, 108, 10.1016/j.csl.2013.04.007 Breiman, 2001, Random forests, Mach Learn, 45, 5, 10.1023/A:1010933404324 Burrell, 2016, How the machine ‘thinks’: understanding opacity in machine learning algorithms, Big Data Soc, 3, 10.1177/2053951715622512 Camber Caspi, 2006, Online deception: prevalence, motivation, and emotion, Cyber Psychol Behav, 9, 54, 10.1089/cpb.2006.9.54 Chaffey, D. 2018. Global social media research summary[Online]. Smart Insights. Available: https://www.smartinsights.com/social-media-marketing/social-media-strategy/new-global-social-media-research/ [Accessed 23 Jun 2018]. Choudhary, 2017, Towards filtering of SMS spam messages using machine learning based technique Chu, 2010, Who is tweeting on Twitter: human, bot, or cyborg?, 21 Cook, 2014, Birds of a feather deceive together: the chicanery of multiplied metadata, J Inf Warfare, 13, 85 Cresci, 2015, Fame for sale: efficient detection of fake Twitter followers, Decis Supp Syst, 80, 56, 10.1016/j.dss.2015.09.003 Dal Pozzolo, 2013, Racing for unbalanced methods selection, 24 De Villiers Depaulo, 1996, Lying in everyday life, J Pers Soc Psychol, 70, 979, 10.1037/0022-3514.70.5.979 Dickerson, 2014, Using sentiment to detect bots on Twitter: are humans more opinionated than bots?, 620 Digital, 2016, Mbete doesn't recognise these tweets - speaker spoofed on Twitter, Times Live Drouin, 2016, Why do people lie online? ``Because everyone lies on the internet", Comput Hum Behav, 64, 134, 10.1016/j.chb.2016.06.052 Ebrahimi, 2016, Recognizing predatory chat documents using semi-supervised anomaly detection, Electron Imag, 2016, 1, 10.2352/ISSN.2470-1173.2016.17.DRR-063 Ferrara, 2016, Predicting online extremism, content adopters, and interaction reciprocity, 22 Fire, 2014, Friend or foe? Fake profile identification in online social networks, Soc Netw Anal Min, 4, 1, 10.1007/s13278-014-0194-4 Galán-García, 2016, Supervised machine learning for the detection of troll profiles in twittersocial network: Application to a real case of cyberbullying, Log J IGPL, 24, 42 Genuer, 2010, Variable selection using random forests, Pattern Recognit Lett, 31, 2225, 10.1016/j.patrec.2010.03.014 Goodman, 2016, ICML workshop on human interpretability in machine learning Gu, 2008, Clustering analysis of network traffic for protocol-and structure-independent botnet detection, 139 Gurajala, 2015, Fake Twitter accounts: profile characteristics obtained using an activity-based pattern detection approach, 9 Gurajala, 2016, Profile characteristics of fake Twitter accounts, Big Data Soc, 3, 10.1177/2053951716674236 Haimson, 2016, Constructing and enforcing “authentic” identity online: Facebook, real names, and non-normative identities, First Monday, 21, 10.5210/fm.v21i6.6791 Halevy, 2014, Being honest about dishonesty: correlating self‐reports and actual lying, Hum Commun Res, 40, 54, 10.1111/hcre.12019 Hancock, 2007 Hancock, 2009, Putting your best face forward: the accuracy of online dating photographs, J Commun, 59, 367, 10.1111/j.1460-2466.2009.01420.x Jeni, 2013, Facing imbalanced data–Recommendations for the use of performance metrics, 245 Jupe, 2016, The lies we live: using the verifiability approach to detect lying about occupation, J Artic Support Null Hypothesis, 13, 1 Kabay, 2014, Anonymity and identity in cyberspace, Comput Secur Handb Sixth Ed, 70, 1 Keen Khandpur, 2017, Crowdsourcing Cybersecurity: Cyber Attack Detection using Social Media, 1049 Kierkegaard, 2008, Cybering, online grooming and ageplay, Comput Law Secur Rev, 24, 41, 10.1016/j.clsr.2007.11.004 Klausen, 2015, Tweeting the jihad: social media networks of Western foreign fighters in Syria and Iraq, Stud Conflict Terrorism, 38, 1, 10.1080/1057610X.2014.974948 Kothari, 2004 Kuhn, 2016, Caret: classification and regression training, R Packag Version, 6, 0 Levenshtein, 1966, Binary codes capable of correcting deletions, insertions, and reversals, Soviet Phys Doklady, 707 LI, 2015, A framework of identity resolution: evaluating identity attributes and matching algorithms, Secur Inform, 4, 1, 10.1186/s13388-015-0021-0 Lipton, 2016, ICML Workshop on Human Interpretability in Machine Learning Ma, 2014, Machine learning for Big Data analytics in plants, Trends Plant Sci, 19, 798, 10.1016/j.tplants.2014.08.004 Mann, 1947, On a test of whether one of two random variables is stochastically larger than the other, Ann Math Stat, 50, 10.1214/aoms/1177730491 Mcdonald, 2009 Menardi, 2014, Training and assessing classification rules with imbalanced data, Data Min Knowl Discov, 28, 1, 10.1007/s10618-012-0295-5 Oentaryo, 2016, On profiling bots in social media, 92 Peddinti, 2017, International AAAI Conference on Web and Social Media Peterson, T. 2016. Rapist who used social media to lure childvictims sentenced to 20 years[Online]. Available: http://www.news24.com/SouthAfrica/News/rapist-who-used-social-media-to-lure-child-victims-sentenced-to-20-years-20160615 [Accessed]. Rényi, 1961, On measures of entropy and information, 547 Ribeiro, 2016, Why should i trust you?: explaining the predictions of any classifier, 1135 Rong, 2016, Money or friends: social identity and deception in networks, Eur Econ Rev, 90, 56, 10.1016/j.euroecorev.2016.04.003 Rubin, 2017, The SAGE Handbook of Social Media Research Methods Saabas, A. 2018. Package for interpreting scikit-learn's decision tree and random forest predictions.[Online]. Available: https://pypi.org/project/treeinterpreter/ [Accessed 23 Jun 2018]. Schwartz, 2013, Personality, gender, and age in the language of social media: the open-vocabulary approach, PloS One, 8, e73791, 10.1371/journal.pone.0073791 Sedhai, 2017, Semi-Supervised Spam Detection in Twitter Stream, IEEE Transactions on Computational Social Systems, 5, 169, 10.1109/TCSS.2017.2773581 Shannon, 2001, A mathematical theory of communication, ACM SIGMOBILE Mob Comput Commun Rev, 5, 3, 10.1145/584091.584093 Shapley, 1953, A value for n-person games, Contrib Theory Games, 2, 307 Shumaker, 1984, Astronomical computing: 1. Computing under the open sky. 2. Virtues of the haversine, Sky Telesc, 68, 158 Smit, 2015, Cyberbullying in South African and American schools: a legal comparative study, South Afr J Educ, 35, 01, 10.15700/saje.v35n2a1076 Stanton, 2016, Development and validation of a measure of online deception and intimacy, Personal Individ Differ, 88, 187, 10.1016/j.paid.2015.09.015 Thomas, 2011, Suspended accounts in retrospect: an analysis of twitter spam, 243 Toma, 2008, Separating fact from fiction: an examination of deceptive self-presentation in online dating profiles, Personal Soc Psychol Bull, 34, 1023, 10.1177/0146167208318067 Tsikerdekis, 2017, Identity deception prevention using common contribution network data, IEEE Trans Inf Forensics Secur, 12, 188, 10.1109/TIFS.2016.2607697 Tsikerdekis, 2014, Multiple account identity deception detection in social media using nonverbal behavior, IEEE Trans Inf Forensics Secur, 9, 1311, 10.1109/TIFS.2014.2332820 Tuna, 2016, User characterization for online social networks, Soc Netw Anal Min, 6, 104, 10.1007/s13278-016-0412-3 Tuteja, 2016, A survey on classification algorithms for email spam filtering, Int J Eng Sci, 5937 Utz, 2005, Types of deception and underlying motivation: What people think, Soc Sci Comput Rev, 23, 49, 10.1177/0894439304271534 Van Der Walt, 2018, Using machine learning to detect fake identities - Bots versus Humans, IEEE Access, 6, 6540, 10.1109/ACCESS.2018.2796018 Van Liere, 2010, How far does a tweet travel?: Information brokers in the twitterverse, 6 Venkatesan, 2017, Detecting Stealthy Botnets in a Resource-Constrained Environment using Reinforcement Learning, 75 Wang, 2006, Automatically detecting criminal identity deception: an adaptive detection algorithm, IEEE Trans Syst Man Cybern Part A: Syst Hum, 36, 988, 10.1109/TSMCA.2006.871799 Wolpert, 1997, No free lunch theorems for optimization, IEEE Trans Evol Comput, 1, 67, 10.1109/4235.585893 Yarkoni, 2016, Choosing prediction over explanation in psychology: Lessons from machine learning, Perspectives on Psychological Science: J Assoc Psychol Sci, 12, 1100, 10.1177/1745691617693393 Zafarani, 2015, Evaluation without ground truth in social media research, Commun ACM, 58, 54, 10.1145/2666680