Impact of reviewer social interaction on online consumer review fraud detection

Journal of Big Data - Volume 4 - Pages 1-19 - 2017
Kunal Goswami (1), Younghee Park (1), Chungsik Song (1)
(1) Department of Computer Engineering, San Jose State University, San Jose, USA

Abstract

Online consumer reviews have become a baseline for new consumers deciding whether to try a business or a new product. Reviews offer a quick look at how the business or product is used and experienced, and they market it to new customers. However, some businesses and reviewers use reviews to spread fake information about a business or product. Fake information can promote a relatively average product or business, or it can malign a competitor. This activity is known as reviewer fraud or opinion spam. This paper proposes a feature set that captures a user’s social interaction behavior to identify fraud. The problem addressed is finding the characteristics that lead to fraud, rather than detecting fraud itself. A neural network is used to evaluate the proposed feature set and to compare it against state-of-the-art feature sets for fraud detection. The feature set considers the user’s social interaction on the Yelp platform to determine whether the user is committing fraud. Any attempt to find the characteristics that lead to fraud must first be good enough to detect fraud as well. The F1 score obtained with the neural network, 0.95, is on par with the well-known methods for detecting fraud, so the feature set rivals other approaches to fraud detection. A user’s social interaction on a digital platform such as Yelp is as important in evaluating the user as social interaction is in real life. The characteristics that lead to fraud can be captured intuitively: the number of friends, the number of followers, and the number of times the user has written a review that multiple people found helpful give the neural network a basis for relating opinion fraud to social interaction characteristics.
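As a rough illustration of the pipeline the abstract describes, the sketch below turns the three social interaction characteristics into a feature vector, scores it with a small feed-forward network, and computes the F1 score used for evaluation. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `log1p` scaling, the network size, the weight values, and the labels are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def social_features(friends, followers, useful_votes):
    """Build a feature vector from the three counts named in the abstract.

    The log1p scaling is an assumption made here to keep large counts
    from saturating the sigmoid; the abstract does not specify scaling.
    """
    return [math.log1p(friends), math.log1p(followers), math.log1p(useful_votes)]


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass through a network with a single hidden layer."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def f1_score(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical, untrained weights: 3 inputs -> 2 hidden units -> 1 output.
W_HIDDEN = [[-0.8, -0.5, -0.6], [0.3, 0.2, 0.4]]
B_HIDDEN = [1.0, -0.5]
W_OUT = [1.5, -1.2]
B_OUT = 0.1

# Fraud score for one made-up reviewer (12 friends, 3 followers, 40 useful votes).
score = forward(social_features(12, 3, 40), W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)

# Made-up labels for illustration only (1 = fraudulent reviewer, 0 = genuine).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
precision, recall, f1 = f1_score(y_true, y_pred)
print(precision, recall, f1)  # 0.75 0.75 0.75
```

In a real experiment the weights would be learned from labeled reviewers rather than set by hand, and the F1 score would be computed on held-out data. F1 is the harmonic mean of precision and recall, so a value of 0.95 requires both to be high at the same time.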
