Tampering with Twitter’s Sample API

Jürgen Pfeffer1, Katja Mayer1, Fred Morstatter2
1Bavarian School of Public Policy, Technical University of Munich, Munich, Germany
2Information Sciences Institute, University of Southern California, Marina del Rey, USA

Abstract

Keywords


References

Boyd D, Golder S, Lotan G (2010) Tweet, tweet, retweet: conversational aspects of retweeting on Twitter. In: System sciences (HICSS), 2010 43rd Hawaii international conference on. IEEE, New York, pp 1–10

Shirky C (2011) The political power of social media: technology, the public sphere, and political change. Foreign Affairs 28–41

Lazer D, Pentland A, Adamic L, Aral S, Barabási A-L, Brewer D, Christakis N, Contractor N, Fowler J, Gutmann M, Jebara T, King G, Macy M, Roy D, Alstyne MV (2009) Computational social science. Science 323(5915):721–723. https://doi.org/10.1126/science.1167742

Gayo-Avello D (2013) A meta-analysis of state-of-the-art electoral prediction from Twitter data. Soc Sci Comput Rev 31(6):649–679. https://doi.org/10.1177/0894439313493979

Palen L, Anderson KM (2016) Crisis informatics—new data for extraordinary times. Science 353(6296):224–225. https://doi.org/10.1126/science.aag2579

Sakaki T, Okazaki M, Matsuo Y (2010) Earthquake shakes Twitter users: real-time event detection by social sensors. In: Proceedings of the 19th international conference on World Wide Web. WWW ’10. ACM, New York, pp 851–860. https://doi.org/10.1145/1772690.1772777

Agichtein E, Castillo C, Donato D, Gionis A, Mishne G (2008) Finding high-quality content in social media. In: Proceedings of the 2008 international conference on web search and data mining. ACM, New York, pp 183–194

Steinert-Threlkeld ZC, Mocanu D, Vespignani A, Fowler J (2015) Online social networks and offline protest. EPJ Data Sci 4(1):19

Hughes AL, Palen L (2009) Twitter adoption and use in mass convergence and emergency events. Int J Emerg Manag 6(3/4):248–260

Olteanu A, Castillo C, Diaz F, Kiciman E (2016) Social data: Biases, methodological pitfalls, and ethical boundaries. SSRN Scholarly Paper ID 2886526, Social Science Research Network, Rochester, NY

Ruths D, Pfeffer J (2014) Social media for large studies of behavior. Science 346(6213):1063–1064

González-Bailón S, Wang N, Rivero A, Borge-Holthoefer J, Moreno Y (2014) Assessing the bias in samples of large online networks. Soc Netw 38:16–27. https://doi.org/10.1016/j.socnet.2014.01.004

Bruns A, Stieglitz S (2012) Quantitative approaches to comparing communication patterns on Twitter. J Technol Hum Serv 30(3–4):160–185. https://doi.org/10.1080/15228835.2012.744249

Driscoll K, Walker S (2014) Big data, big questions—working within a black box: transparency in the collection and production of big Twitter data. Int J Commun 8:20

Burgess J, Bruns A (2015) Easy data, hard data: the policies and pragmatics of Twitter research after the computational turn. In: Compromised data: from social media to big data, pp 93–111

Elmer G, Langlois G, Redden J (2015) Compromised data: from social media to big data. Bloomsbury Publishing, New York

Gaffney D, Puschmann C (2013) Data collection on Twitter. Peter Lang, New York, pp 55–67

Howison J, Wiggins A, Crowston K (2011) Validity issues in the use of social network analysis with digital trace data. J Assoc Inf Syst 12:2

Hannak A, Soeller G, Lazer D, Mislove A, Wilson C (2014) Measuring price discrimination and steering on e-commerce web sites. In: Proceedings of the 2014 conference on Internet measurement conference, pp 305–318

King G (2011) Ensuring the data rich future of the social sciences. Science 331:719–721

Chen L, Mislove A, Wilson C (2015) Peeking beneath the hood of Uber. In: Proceedings of the 2015 Internet measurement conference. IMC ’15. ACM, New York, pp 495–508. https://doi.org/10.1145/2815675.2815681

Eslami M, Rickman A, Vaccaro K, Aleyasen A, Vuong A, Karahalios K, Hamilton K, Sandvig C (2015) I always assumed that I wasn’t really that close to [her]: reasoning about invisible algorithms in news feeds. In: Proceedings of the 33rd annual ACM conference on human factors in computing systems, pp 153–162

Williams SA, Terras MM, Warwick C (2013) What do people study when they study Twitter? Classifying Twitter related academic papers. J Doc 69(3):384–410

Zimmer M, Proferes NJ (2014) A topology of Twitter research: disciplines, methods, and ethics. Aslib J Inf Manag 66(3):250–261

Rosenthal S, Farra N, Nakov P (2017) Semeval-2017 task 4: sentiment analysis in Twitter. In: Proceedings of the 11th international workshop on semantic evaluation (SemEval-2017), pp 502–518

Bastos MT (2015) Shares, pins, and tweets: news readership from daily papers to social media. Journalism Studies 16(3):305–325

Newman N, Levy D, Nielsen RK (2016) Digital news report 2016. Reuters Institute for the Study of Journalism

Nielsen RK, Schrøder KC (2014) The relative importance of social media for accessing, finding, and engaging with news: an eight-country cross-media comparison. Digital Journalism 2(4):472–489

Ausserhofer J, Maireder A (2013) National politics on Twitter: structures and topics of a networked public sphere. Inf Commun Soc 16(3):291–314

Neuberger C, vom Hofe J, Nuernbergk C (2014) The use of Twitter by professional journalists: results of a newsroom survey in Germany. In: Weller K, Bruns A, Burgess J, Mahrt M, Puschmann C (eds) Twitter and society. Peter Lang, New York, pp 345–357

Lasorsa DL, Lewis SC, Holton AE (2012) Normalizing Twitter: journalism practice in an emerging communication space. Journalism Studies 13(1):19–36

Varol O, Ferrara E, Menczer F, Flammini A (2017) Early detection of promoted campaigns on social media. EPJ Data Sci 6(1):13

O’Connor B, Balasubramanyan R, Routledge BR, Smith NA (2010) From tweets to polls: linking text sentiment to public opinion time series. ICWSM 11(1–2):122–129

Wang W, Chen L, Thirunarayan K, Sheth AP (2014) Cursing in English on Twitter. In: Proceedings of the 17th ACM conference on computer supported cooperative work and social computing, pp 415–425

Shao C, Hui P-M, Wang L, Jiang X, Flammini A, Menczer F, Ciampaglia GL (2018) Anatomy of an online misinformation network. PLoS ONE 13(4):e0196087

Tufekci Z (2014) Big questions for social media big data: representativeness, validity and other methodological pitfalls. In: Proceedings of the eighth international AAAI conference on weblogs and social media. AAAI Press, Menlo Park, pp 505–514

Mislove A, Lehmann S, Ahn Y-Y, Onnela J-P, Rosenquist JN (2011) Understanding the demographics of Twitter users. In: Proceedings of the fifth international AAAI conference on weblogs and social media, pp 554–557

Malik MM, Lamba H, Nakos C, Pfeffer J (2015) Population bias in geotagged tweets. In: ICWSM workshop on standards and practices in large-scale social media research

Crawford K, Finn M (2015) The limits of crisis data: analytical and ethical challenges of using social and mobile data to understand disasters. GeoJournal 80(4):491–502

Malik MM, Pfeffer J (2016) Identifying platform effects in social media data

Lazer D, Kennedy R, King G, Vespignani A (2014) The parable of Google flu: traps in big data analysis. Science 343(6176):1203–1205. https://doi.org/10.1126/science.1248506

Kergl D, Roedler R, Seeber S (2014) On the endogenesis of Twitter’s Spritzer and Gardenhose sample streams. In: 2014 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM), pp 357–364. https://doi.org/10.1109/ASONAM.2014.6921610

Morstatter F, Dani H, Sampson J, Liu H (2016) Can one tamper with the sample API?: toward neutralizing bias from spam and bot content. In: Proceedings of the 25th international world wide web conference, pp 81–82

Weller K, Bruns A, Burgess J, Mahrt M, Puschmann C (eds) Twitter and society, vol 89. Peter Lang, New York

Joseph K, Landwehr PM, Carley KM (2014) Two 1%s don’t make a whole: comparing simultaneous samples from Twitter’s streaming api. In: International conference on social computing, behavioral-cultural modeling, and prediction. Springer, Berlin, pp 75–83

Yates A, Kolcz A, Goharian N, Frieder O (2016) Effects of sampling on Twitter trend detection. In: Proceedings of the tenth international conference on language resources and evaluation (LREC 2016), Paris, France

Morstatter F, Pfeffer J, Liu H, Carley KM (2013) Is the sample good enough? Comparing data from Twitter’s streaming api with Twitter’s firehose. In: Seventh international AAAI conference on weblogs and social media

Cihon P, Yasseri T (2016) A biased review of biases in Twitter studies on political collective action. At the Crossroads: Lessons and Challenges in Computational Social Science 91

Morstatter F, Pfeffer J, Liu H (2014) When is it biased?: assessing the representativeness of Twitter’s streaming api. In: Proceedings of the 23rd international conference on World Wide Web. ACM, Seoul, pp 555–556

Crawford K, Gray ML, Miltner K (2014) Big data—critiquing big data: politics, ethics, epistemology—special section introduction. Int J Commun 8:10

Gerlitz C, Rieder B (2013) Mining one percent of Twitter: collections, baselines, sampling. M/C Journal 16(2):1–18

Wagner C, Singer P, Karimi F, Pfeffer J, Strohmaier M (2017) Sampling from social networks with attributes. In: Proceedings of the 26th international conference on World Wide Web. WWW ’17, pp 1181–1190

Lamba H, Hooi B, Shin K, Faloutsos C, Pfeffer J (2017) Zoorank: ranking suspicious activities in time-evolving tensors. In: ECML PKDD, the European conference on machine learning and principles and practice of knowledge discovery in databases (ECML-PKDD)

Ferrara E, Varol O, Davis C, Menczer F, Flammini A (2016) The rise of social bots. Commun ACM 59(7):96–104

Varol O, Ferrara E, Davis CA, Menczer F, Flammini A (2017) Online human-bot interactions: detection, estimation, and characterization pp 280–289

Hegelich S, Janetzko D (2016) Are social bots on Twitter political actors? Empirical evidence from a Ukrainian social botnet pp 579–582

Ratkiewicz J, Conover M, Meiss M, Gonçalves B, Flammini A, Menczer F (2011) Detecting and tracking political abuse in social media. In: ICWSM

Lee S, Kim J (2014) Early filtering of ephemeral malicious accounts on Twitter. Comput Commun 54:48–57

Chu Z, Gianvecchio S, Wang H, Jajodia S (2010) Who is tweeting on Twitter: human, bot, or cyborg?. In: Proceedings of the 26th annual computer security applications conference. ACM, New York, pp 21–30

Lee K, Eoff BD, Caverlee J (2011) Seven months with the devils: a long-term study of content polluters on Twitter. In: ICWSM. Citeseer

Agarwal A, Xie B, Vovsha I, Rambow O, Passonneau R (2011) Sentiment analysis of Twitter data. In: Proceedings of the workshop on languages in social media. LSM ’11. Association for Computational Linguistics, Stroudsburg, pp 30–38

Pak A, Paroubek P (2010) Twitter as a corpus for sentiment analysis and opinion mining. In: International conference on language resources and evaluation, Valletta, Malta

Tumasjan A, Sprenger TO, Sandner PG, Welpe IM (2010) Predicting elections with Twitter: what 140 characters reveal about political sentiment. In: Fourth international AAAI conference on weblogs and social media

Wang H, Can D, Kazemzadeh A, Bar F, Narayanan S (2012) A system for real-time Twitter sentiment analysis of 2012 U.S. presidential election cycle. In: Proceedings of the ACL 2012 system demonstrations. ACL ’12. Association for Computational Linguistics, Stroudsburg, pp 115–120

Pennebaker JW, Booth RJ, Francis ME (2007) Linguistic inquiry and word count: LIWC [computer software]. Austin, TX: liwc.net

Hutto C, Gilbert E (2014) Vader: a parsimonious rule-based model for sentiment analysis of social media text. In: Eighth international AAAI conference on weblogs and social media

de Saint-Exupéry A (1943) The Little Prince. Reynal & Hitchcock, New York

Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Retr 2(1–2):1–135

Howard PN, Kollanyi B, Woolley S (2016) Bots and automation over Twitter during the US election. Computational Propaganda Project: Working Paper Series

Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423, 623–656

Daneshpazhouh A, Sami A (2014) Entropy-based outlier detection using semi-supervised approach with few positive examples. Pattern Recognit Lett 49:77–84

Echeverria J, Zhou S (2017) Discovery, retrieval, and analysis of the ’star wars’ botnet in Twitter. In: Proceedings of the 2017 IEEE/ACM international conference on advances in social networks analysis and mining 2017, pp 1–8

Mayer-Schönberger V, Cukier K (2013) Big data: a revolution that will transform how we live, work, and think. Houghton Mifflin Harcourt, New York