Leveraging Transfer Learning to Analyze Opinions, Attitudes, and Behavioral Intentions Toward COVID-19 Vaccines: Social Media Content and Temporal Analysis

Journal of Medical Internet Research - Volume 23, Issue 8 - Page e30251
Siru Liu (1), Jili Li (2), Jialin Liu (3)
(1) Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
(2) West China Medical School, Sichuan University, Chengdu, China
(3) Department of Medical Informatics, West China Hospital, Sichuan University, Chengdu, China

Abstract

Background: The COVID-19 vaccine is considered the most promising approach to alleviating the pandemic. However, recent surveys have found low acceptance of the vaccine. To design more effective outreach interventions, there is an urgent need to understand public perceptions of COVID-19 vaccines.

Objective: Our objective was to analyze the potential of leveraging transfer learning to detect tweets containing opinions, attitudes, and behavioral intentions toward COVID-19 vaccines, to explore temporal trends, and to automatically extract topics across a large number of tweets.

Methods: We developed machine learning and transfer learning models to classify tweets, followed by temporal analysis and topic modeling on a dataset of COVID-19 vaccine–related tweets posted from November 1, 2020, to January 31, 2021. We used F1 values as the primary outcome to compare the performance of the machine learning and transfer learning models. Test statistics and P values from the augmented Dickey-Fuller test were used to assess whether users’ perceptions changed over time. The main topics in the tweets were extracted by latent Dirichlet allocation analysis.

Results: We collected 2,678,372 tweets related to COVID-19 vaccines from 841,978 unique users and annotated 5000 tweets. The F1 values of the transfer learning models were 0.792 (95% CI 0.789-0.795), 0.578 (95% CI 0.572-0.584), and 0.614 (95% CI 0.606-0.622) for the three classification tasks, significantly outperforming the machine learning models (logistic regression, random forest, and support vector machine). The prevalence of tweets containing attitudes and behavioral intentions varied significantly over time. Specifically, tweets containing positive behavioral intentions increased significantly in December 2020. In addition, we selected tweets in four categories (positive attitudes, negative attitudes, positive behavioral intentions, and negative behavioral intentions) and identified 10 main topics and relevant terms for each category.

Conclusions: Overall, we provided a method to automatically analyze the public understanding of COVID-19 vaccines from real-time social media data, which can be used to tailor educational programs and other interventions to effectively promote public acceptance of COVID-19 vaccines.
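
For readers unfamiliar with the primary outcome named in the Methods, the following is a minimal sketch of how an F1 value is computed for one binary classification task (for example, whether a tweet contains a behavioral intention). It is not the authors' code: the function name `f1_score_binary` and the toy labels are hypothetical and chosen only for illustration, and the abstract does not say how the reported confidence intervals were obtained, so none are computed here.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Return the F1 value (harmonic mean of precision and recall)
    for the class labeled `positive`. Hypothetical helper, illustration only."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined),
        # so F1 is reported as 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy annotations (made up): 1 = tweet contains the target content, 0 = it does not.
gold_labels = [1, 1, 1, 0, 0, 1, 0, 0]
model_labels = [1, 0, 1, 0, 1, 1, 0, 0]

print(f"F1 = {f1_score_binary(gold_labels, model_labels):.3f}")
```

Because F1 ignores true negatives, it is a common choice when the class of interest is relatively rare among the annotated tweets, which is one plausible reason to prefer it over plain accuracy in this kind of task.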

