Tierney, Heather L.R. and Pan, Bing (2010): A Poisson Regression Examination of the Relationship between Website Traffic and Search Engine Queries.
A new area of research uses normalized and scaled Google search volume data to predict economic activity. This new source of data has both advantages and disadvantages. Daily and weekly data are employed to show the effect of aggregation in Google data, which can lead to contradictory findings. In this paper, Poisson regressions are used to explore the relationship between the online traffic to a specific website and the search volumes for certain search queries, along with the rankings of that website for those queries. The purpose of this paper is to point out the benefits and pitfalls of a potential new source of data whose raw values are not transparent because of the normalization and scaling procedures that Google applies.
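To make the modelling approach concrete, the following is a minimal sketch of a Poisson regression with a log link, fitted by Newton-Raphson in plain Python. It is an illustration only and not the authors' specification: the function name `fit_poisson`, the single regressor, and the numbers in the usage example are all assumptions made up for this sketch. The paper additionally uses the website's ranking for each query as an explanatory variable, which is omitted here.

```python
import math

def fit_poisson(x, y, iterations=50, tol=1e-10):
    """Fit log E[y | x] = b0 + b1 * x by maximizing the Poisson log-likelihood.

    Hypothetical helper for illustration: one regressor plus an intercept.
    """
    # Start from the intercept-only fit: b0 = log(mean(y)), b1 = 0.
    b0 = math.log(sum(y) / len(y))
    b1 = 0.0
    for _ in range(iterations):
        # Fitted conditional means under the current coefficients.
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * d = g in closed form.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Made-up illustrative data: a scaled search index (0-100) and visit counts.
search_index = [20, 35, 50, 65, 80, 100]
visits = [4, 6, 7, 11, 14, 21]

b0, b1 = fit_poisson(search_index, visits)
print(f"intercept={b0:.4f}, slope={b1:.4f}")
```

Under this model, a one-point increase in the search index multiplies the expected number of visits by exp(b1). Because the index is normalized and scaled rather than a raw count, the coefficient describes the relationship with the index only, not with the underlying number of searches.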
|Item Type:||MPRA Paper|
|Original Title:||A Poisson Regression Examination of the Relationship between Website Traffic and Search Engine Queries|
|Keywords:||Poisson Regression, Search Engine, Google Insights, Aggregation, Normalization Effects, Scaling Effects|
|Subjects:||C - Mathematical and Quantitative Methods > C4 - Econometric and Statistical Methods: Special Topics > C43 - Index Numbers and Aggregation; D - Microeconomics > D8 - Information, Knowledge, and Uncertainty > D83 - Search; Learning; Information and Knowledge; Communication; Belief; C - Mathematical and Quantitative Methods > C2 - Single Equation Models; Single Variables > C25 - Discrete Regression and Qualitative Choice Models; Discrete Regressors; Proportions|
|Depositing User:||Heather L.R. Tierney|
|Date Deposited:||08. Jul 2011 23:32|
|Last Modified:||12. Feb 2013 20:36|