Fantazzini, Dean (2020): Short-term forecasting of the COVID-19 pandemic using Google Trends data: Evidence from 158 countries. Forthcoming in: Applied Econometrics (2020): 1-20.
Abstract
The ability of Google Trends data to forecast the number of new daily cases and deaths of COVID-19 is examined using a dataset of 158 countries. The analysis includes the computation of lag correlations between confirmed cases and Google data, Granger causality tests, and an out-of-sample forecasting exercise with 18 competing models and a forecast horizon of 14 days ahead. The evidence shows that Google-augmented models outperform the competing models for most of the countries. This is significant because Google data can complement epidemiological models during difficult times like the ongoing COVID-19 pandemic, when official statistics may not be fully reliable and/or may be published with a delay. Moreover, real-time tracking with online data is one of the instruments that can be used to keep the situation under control when national lockdowns are lifted and economies gradually reopen.
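As a rough illustration of the lag-correlation step mentioned in the abstract, the sketch below correlates a case series with a search-interest series shifted back by 0 to 5 days and reports the lag with the highest correlation. It is not the paper's code: the data are synthetic, and the function names `pearson` and `lag_correlations` are hypothetical helpers introduced only for this example.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lag_correlations(cases, searches, max_lag):
    """Correlate cases[t] with searches[t - k] for each lag k = 0..max_lag."""
    result = {}
    for k in range(max_lag + 1):
        # Drop the first k case values and the last k search values so that
        # each case observation is paired with the search value k days earlier.
        result[k] = pearson(cases[k:], searches[:len(searches) - k])
    return result


# Synthetic data (an assumption for illustration, not real Google Trends or
# ECDC data): the case series echoes the search series three days later.
searches = [1, 3, 2, 5, 8, 6, 9, 12, 10, 14, 13, 17, 15, 20, 18, 22]
cases = [0, 0, 0] + [2 * s for s in searches[:-3]]

corr = lag_correlations(cases, searches, max_lag=5)
best_lag = max(corr, key=corr.get)
print("lag with highest correlation:", best_lag)
```

With this constructed series the highest correlation falls at a lag of three days, because the cases were built as a delayed copy of the searches. On real data a high lag correlation is only descriptive, which is presumably why the abstract pairs it with Granger causality tests and an out-of-sample forecasting comparison.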
Item Type: | MPRA Paper |
---|---|
Original Title: | Short-term forecasting of the COVID-19 pandemic using Google Trends data: Evidence from 158 countries |
Language: | English |
Keywords: | Covid-19; Google Trends; VAR; ARIMA; ARIMA-X; ETS; LASSO; SIR model |
Subjects: | C - Mathematical and Quantitative Methods > C2 - Single Equation Models ; Single Variables > C22 - Time-Series Models ; Dynamic Quantile Regressions ; Dynamic Treatment Effect Models ; Diffusion Processes C - Mathematical and Quantitative Methods > C3 - Multiple or Simultaneous Equation Models ; Multiple Variables > C32 - Time-Series Models ; Dynamic Quantile Regressions ; Dynamic Treatment Effect Models ; Diffusion Processes ; State Space Models C - Mathematical and Quantitative Methods > C5 - Econometric Modeling > C51 - Model Construction and Estimation C - Mathematical and Quantitative Methods > C5 - Econometric Modeling > C53 - Forecasting and Prediction Methods ; Simulation Methods G - Financial Economics > G1 - General Financial Markets > G17 - Financial Forecasting and Simulation I - Health, Education, and Welfare > I1 - Health > I18 - Government Policy ; Regulation ; Public Health I - Health, Education, and Welfare > I1 - Health > I19 - Other |
Item ID: | 102315 |
Depositing User: | Prof. Dean Fantazzini |
Date Deposited: | 10 Aug 2020 07:50 |
Last Modified: | 10 Aug 2020 07:50 |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/102315 |