Mining latent relations in peer-production environments: a case study with Wikipedia article similarity and controversy

Social Network Analysis and Mining, Volume 2, pages 265-278, 2011
Chenliang Li, Anwitaman Datta, Aixin Sun
School of Computer Engineering, Nanyang Technological University, Singapore

Abstract

As people participate actively in social networking and peer-production sites, additional, implicit relations emerge from their activities. Mining such latent relations, or the wisdom of crowds, is itself an important area of ongoing research, with both general and domain-specific techniques. In this paper, we propose a new similarity measure, which we call expert-based similarity, to discover semantic relations among Wikipedia articles from the co-editorship perspective. Different kinds of relations among entities may also reveal diverse information. To explore and demonstrate this premise, we carry out a case study that leverages multiple relations among Wikipedia articles. Specifically, we use expert-based similarity, along with other standard similarity measures, to discern the influence of several factors that are hypothesized to generate controversies in Wikipedia articles. In the context of Wikipedia-specific research, our case study helps to better differentiate the degree of impact of some of the possible causes of controversies.
