A case study of using natural language processing to extract consumer insights from tweets in American cities for public health crises

BMC Public Health - Volume 23 - Pages 1-16 - 2023
Ye Wang1, Erin Willis2, Vijaya K. Yeruva3, Duy Ho3, Yugyung Lee3
1Department of Communication and Journalism, University of Missouri-Kansas City, Kansas City, USA
2Department of Advertising, Public Relations & Media Design, University of Colorado Boulder, Boulder, USA
3Division of Computing, Analytics, and Mathematics, University of Missouri-Kansas City, Kansas City, USA

Abstract

The COVID-19 pandemic was a wake-up call for public health agencies. These agencies are often ill-prepared to communicate clearly and effectively with target audiences about community-level activations and safety operations. The obstacle is a lack of data-driven approaches to obtaining insights from local community stakeholders. This study therefore proposes a focus on listening at the local level, given the abundance of geo-marked data, and presents a methodological solution for extracting consumer insights from unstructured text data for health communication. It demonstrates how to combine human analysis with Natural Language Processing (NLP) machine analysis to reliably extract meaningful consumer insights from tweets about COVID and the vaccine. The case study employed Latent Dirichlet Allocation (LDA) topic modeling, Bidirectional Encoder Representations from Transformers (BERT) emotion analysis, and human textual analysis to examine 180,128 tweets collected through the keyword function of the Twitter Application Programming Interface (API) from January 2020 to June 2021. The samples came from four medium-sized American cities with larger populations of people of color. The NLP method discovered four topic trends, “COVID Vaccines,” “Politics,” “Mitigation Measures,” and “Community/Local Issues,” along with changes in emotion over time. The human textual analysis profiled the discussions in the four selected markets, adding depth to our understanding of the distinct challenges each experienced. Ultimately, the study demonstrates that this method can efficiently reduce a large volume of community feedback (e.g., tweets, social media data) with NLP while preserving contextualization and richness through human interpretation. Based on the findings, three recommendations for communicating about vaccination are offered: (1) the strategic objective should be empowering the public; (2) the message should have local relevance; and (3) communication needs to be timely.
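As a rough illustration of the topic-modeling step named in the abstract, the sketch below fits a tiny LDA model with collapsed Gibbs sampling in plain Python. It is not the authors' pipeline: the sampler, the hyperparameters `alpha` and `beta`, the helper names `fit_lda` and `top_words`, and the six toy "tweets" are all assumptions made up for this example, and a real analysis would more likely rely on an established topic-modeling library and proper text preprocessing.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per topic, per doc
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # tokens per topic overall

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]

                # Resample the topic and restore the counts.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Return the n most frequent words assigned to each topic."""
    return [[w for w, _ in counts.most_common(n)] for counts in topic_word]


# Invented toy texts, for illustration only (not data from the study).
toy_tweets = [
    "vaccine appointment clinic vaccine dose",
    "vaccine dose clinic appointment",
    "mask mandate school mask distancing",
    "mask distancing mandate school",
    "clinic vaccine dose appointment",
    "school mask mandate distancing",
]
docs = [tweet.split() for tweet in toy_tweets]

doc_topic, topic_word = fit_lda(docs, num_topics=2)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {words}")
```

The per-topic word lists produced this way are only a starting point: as the abstract stresses, labels such as “COVID Vaccines” or “Mitigation Measures” come from human interpretation of the model output, not from the model itself.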
