Forecasting influenza epidemics by integrating internet search queries and traditional surveillance data with the support vector machine regression model in Liaoning, from 2011 to 2015
Abstract
Influenza epidemics pose significant social and economic challenges in China. Internet search query data have been identified as a valuable source for the detection of emerging influenza epidemics. However, the selection of the search queries and the adoption of prediction methods are crucial challenges when it comes to improving predictions. The purpose of this study was to explore the application of the Support Vector Machine (SVM) regression model in merging search engine query data and traditional influenza data.
The officially reported monthly number of influenza cases in Liaoning province, China, from January 2011 to December 2015 was acquired from the China National Scientific Data Center for Public Health. Search queries potentially related to influenza over the corresponding period were identified using Baidu Index, a publicly available search engine database. An SVM regression model was built for prediction, and the choice of three parameters (
In total, 17 search queries related to influenza were generated through the initial query selection approach and were adopted to construct the SVM regression model: nine queries in the same month, three queries at a lag of one month, one query at a lag of two months and four queries at a lag of three months. The SVM model performed well with the parameters (
The results demonstrated the feasibility of using internet search engine query data as a complementary data source for influenza surveillance and the effectiveness of the SVM regression model in tracking influenza epidemics in Liaoning.
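The lag structure described above (queries used in the same month or at lags of one to three months) can be illustrated with a minimal sketch. It assumes a lag of k months pairs the case count in month t with the query volume from month t - k. The function name `build_lagged_rows`, the query names, and all numbers are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def build_lagged_rows(
    cases: List[float],
    queries: Dict[str, List[float]],
    lags: Dict[str, int],
) -> Tuple[List[List[float]], List[float]]:
    """Align each query series with the case series at its chosen lag.

    A lag of k months pairs the case count in month t with the query
    volume from month t - k. The first max-lag months are dropped
    because they lack a full set of lagged values.
    """
    names = sorted(lags)  # fixed column order
    max_lag = max(lags.values(), default=0)
    features: List[List[float]] = []
    targets: List[float] = []
    for t in range(max_lag, len(cases)):
        features.append([queries[name][t - lags[name]] for name in names])
        targets.append(cases[t])
    return features, targets


# Made-up illustrative numbers, not data from the study.
cases = [120.0, 150.0, 310.0, 280.0, 90.0, 60.0]
queries = {
    "query_a": [10.0, 14.0, 30.0, 26.0, 9.0, 5.0],  # used in the same month
    "query_b": [3.0, 8.0, 7.0, 2.0, 1.0, 1.0],      # used at a one-month lag
}
lags = {"query_a": 0, "query_b": 1}

X, y = build_lagged_rows(cases, queries, lags)
```

Rows built this way could then be passed to any support vector regression implementation, for example `sklearn.svm.SVR` from scikit-learn. That library choice is an assumption here, since the abstract does not name the software used.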