Chatelain, Jean-Bernard (2010): Can statistics do without artefacts? Published in: Prisme No. 19 (December 2010): pp. 1-39.
This companion paper to Chatelain and Ralf (2012), “Spurious regressions with near-multicollinearity”, puts their results into the context of the history of statistics, of the current publication bias in applied sciences, and of the debate on substantive versus statistical significance. It presents a particular case of spurious regression, in which the dependent variable has a simple correlation coefficient close to zero with each of two other variables, which are, by contrast, highly correlated with each other. In these spurious regressions, the parameters measuring the size of the effect on the dependent variable are very large, and they can be “statistically significant”. The tendency of scientific journals to favour the publication of statistically significant results is one reason why spurious regressions are so numerous, especially since they are easy to build with variables that are lagged, squared or interacted with another variable. Such regressions can enhance the reputation of researchers by creating the appearance of strong effects between variables. These often surprising effects are not robust and often depend on a limited number of observations, which fuels scientific controversies. The resulting meta-analyses, based on a statistical synthesis of the literature evaluating the effect between the two variables, confirm the absence of any effect. This article provides an example of this phenomenon from the empirical literature that aims to evaluate the impact of development aid on economic growth.
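The mechanism described in the abstract can be illustrated with the textbook formula for standardized OLS coefficients in a regression with two regressors. The sketch below is not taken from the paper: the function names and the correlation values (0.05, -0.05 and 0.99) are illustrative assumptions, chosen only to show how near-zero simple correlations with the dependent variable can coexist with large estimated effects when the two regressors are highly correlated.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized OLS coefficients of y on x1 and x2.

    Inputs are the three simple correlation coefficients:
    r_y1 = corr(y, x1), r_y2 = corr(y, x2), r_12 = corr(x1, x2).
    """
    denom = 1.0 - r_12 ** 2  # shrinks towards zero as x1 and x2 become collinear
    beta_1 = (r_y1 - r_12 * r_y2) / denom
    beta_2 = (r_y2 - r_12 * r_y1) / denom
    return beta_1, beta_2


def r_squared(r_y1, r_y2, r_12):
    """Coefficient of determination of the two-regressor model."""
    beta_1, beta_2 = standardized_betas(r_y1, r_y2, r_12)
    return beta_1 * r_y1 + beta_2 * r_y2


# Hypothetical values: y is almost uncorrelated with each regressor,
# while the two regressors are nearly collinear.
r_y1, r_y2, r_12 = 0.05, -0.05, 0.99

beta_1, beta_2 = standardized_betas(r_y1, r_y2, r_12)
print("standardized coefficients:", beta_1, beta_2)
print("R squared:", r_squared(r_y1, r_y2, r_12))
```

With these assumed values, the small denominator 1 - r_12² inflates both coefficients far beyond the simple correlations, and the two coefficients take opposite signs. Whether such estimates would also be statistically significant depends on the sample size, which this sketch does not model.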
Item Type: MPRA Paper
Original Title: Can statistics do without artefacts?
Keywords: Spurious regressions, statistical significance, near-multicollinearity, classical suppressors, growth, development aid
Subjects:
O - Economic Development, Technological Change, and Growth > O4 - Economic Growth and Aggregate Productivity > O47 - Measurement of Economic Growth; Aggregate Productivity; Cross-Country Output Convergence
F - International Economics > F3 - International Finance > F35 - Foreign Aid
C - Mathematical and Quantitative Methods > C5 - Econometric Modeling > C52 - Model Evaluation, Validation, and Selection
C - Mathematical and Quantitative Methods > C1 - Econometric and Statistical Methods and Methodology: General > C12 - Hypothesis Testing: General
P - Economic Systems > P4 - Other Economic Systems > P45 - International Trade, Finance, Investment, and Aid
B - History of Economic Thought, Methodology, and Heterodox Approaches > B1 - History of Economic Thought through 1925 > B16 - Quantitative and Mathematical
B - History of Economic Thought, Methodology, and Heterodox Approaches > B4 - Economic Methodology > B41 - Economic Methodology
Depositing User: Jean-Bernard Chatelain
Date Deposited: 28 Nov 2012 13:21
Last Modified: 15 Feb 2013 08:15
Aldrich, J. (1995), “Correlations Genuine and Spurious in Pearson and Yule”, Statistical Science, 10(4), pp. 364–76.
Ahn, S.C., C. Gadarowski and M.F. Perez (2009), “Effects of Beta Distribution and Persistent Factors on the Two Pass Cross-Sectional Regression”, working paper, Arizona State University.
Armatte, M. (2001), “Le statut changeant de la corrélation en économétrie (1910-1940)”, Revue Economique, 52(3), pp. 617–31.
Bernard, C. (1993), Introduction à l’étude de la médecine expérimentale, Paris: Champs, Flammarion.
Burnside, C. and D. Dollar (2000), “Aid, Policies and Growth”, American Economic Review, 90(4), pp. 847–68.
Bühlmann, P., M. Kalisch and M.H. Maathuis (2010), “Variable Selection in High-Dimensional Models: Partially Faithful Distributions and the PC-Simple Algorithm”, Biometrika, 97, pp. 261–78.
Chatelain, J.B. and K. Ralf (2012), “Spurious Regressions and Near-Multicollinearity, with an Application to Aid, Policies and Growth”, working paper.
Cohen, J. (1994), “The earth is round (p<.05)”, American Psychologist, 49(12), pp. 997–1003.
Denton, F.T. (1985), “Data Mining as an Industry”, The Review of Economics and Statistics, 67(1), pp. 124–27.
Doucouliagos, H. and M. Paldam (2009), “The Aid Effectiveness Literature: The Sad Results of 40 Years of Research”, Journal of Economic Surveys, 23(3), pp. 433–61.
Doucouliagos, H. and M. Paldam (2010), “Conditional Aid Effectiveness: A Meta Study”, Journal of International Development, 22(4), pp. 391–410.
Easterly, W., R. Levine and D. Roodman (2004), “New Data, New Doubts: A Comment on Burnside and Dollar’s ‘Aid, Policies, and Growth’ (2000)”, American Economic Review, 94(3), pp. 774–80.
Fama, E.F., and K.R. French (1993), “Common Risk Factors in the Returns on Stocks and Bonds”, Journal of Financial Economics, 33, pp. 3–56.
Fisher, R. (1925), “Applications of ‘Student’s’ distribution”, Metron, 5(3), pp. 90–104.
Frisch, R. (1934), “Statistical Confluence Analysis by Means of Complete Regressions Systems”, Publication no 5, University Institute of Economics, Oslo.
Freedman, D. (1997), “From Association to Causation via Regression”, in V.R. McKim and S.P. Turner (eds), Causality in Crisis?, Notre Dame, IN: University of Notre Dame Press, pp. 113–61.
Galton, F. (1886), “Regression Towards Mediocrity in Hereditary Stature”, The Journal of the Anthropological Institute of Great Britain and Ireland, 15, pp. 246–63.
Hendry, D.F. and M.S. Morgan (1989), “A Re-Analysis of Confluence Analysis”, Oxford Economic Papers, 41, pp. 35–52.
Hoover, K.D. (2001), Causality in Macroeconomics, Cambridge, UK: Cambridge University Press.
Hume, D. (2000), A Treatise of Human Nature: Being an Attempt to Introduce the Experimental Method of Reasoning into Moral Subjects, republished in D.F. Norton and M.J. Norton (eds), A Treatise of Human Nature, Oxford: Oxford University Press.
Ioannidis, J.P.A. (2008), “Why Most Discovered True Associations Are Inflated”, Epidemiology, 19(5), pp. 640–48.
Jagannathan, R. and Z. Wang (1996), “The Conditional CAPM and the Cross-Section of Expected Return”, Journal of Finance, 51, pp. 3–53.
Johansen, S. (2005), “Interpretation of Cointegrating Coefficients in the Cointegrated Vector Autoregressive Model”, Oxford Bulletin of Economics and Statistics, 67(1), pp. 93–104.
Keuzenkamp, H. (2000), Probability, Econometrics and Truth: The Methodology of Econometrics, Cambridge: Cambridge University Press.
Legendre, A.M. (1805), Nouvelles méthodes pour la détermination des orbites des comètes, Paris: Courcier.
Magnus, J.R. and M.S. Morgan (1999), Methodology and Tacit Knowledge: Two Experiments in Econometrics, Chichester: John Wiley & Sons Ltd.
McCloskey, D. and S.T. Ziliak (2008), The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice and Lives, Ann Arbor, MI: University of Michigan Press.
Milton, J.R. (1987), “Induction before Hume”, The British Journal for the Philosophy of Science, 38(3), pp. 49–74.
Moore, H.L. (1905), “The Personality of Antoine Augustin Cournot”, The Quarterly Journal of Economics, 19(3), pp. 370–99.
Moore, H.L. (1917), Forecasting the Yield and the Price of Cotton, New York: Macmillan.
Neyman, J. and E.S. Pearson (1933), “On the Problem of the Most Efficient Tests of Statistical Hypotheses”, Philosophical Transactions of the Royal Society London (A), 231, pp. 289–337.
Pearl, J. (2009), Causality: Models, Reasoning and Inference (2nd edition), Cambridge: Cambridge University Press.
Pearson, K. (1897), “On a Form of Spurious Correlation that May Arise when Indices Are Used in the Measurement of Organs”, Proceedings of the Royal Society London, Series A, 60, pp. 489–98.
Pearson, E.S. (1939), “‘Student’ as Statistician”, Biometrika, 30(3/4), pp. 210–50.
Petersen, M.A. (2009), “Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches”, Review of Financial Studies, 22(1), pp. 435–80.
Sextus Empiricus [c. 200] (1997), [Πυῤῥώνειοι ὑποτύπωσεις], Esquisses pyrrhoniennes, translated by P. Pellegrin, Paris: Le Seuil.
Simon, H. (1954), “Spurious Correlation: A Causal Interpretation”, Journal of the American Statistical Association, 49, pp. 467–92.
Spirtes, P., C.N. Glymour and R. Scheines (2000), Causation, Prediction, and Search, 2nd edition, Cambridge, UK: Cambridge University Press.
Stanley, T.D. (2005), “Beyond Publication Bias”, Journal of Economic Surveys, 19, pp. 309–45.
Student (1908), “The Probable Error of a Mean”, Biometrika, 6, pp. 1–25.
Tinbergen, J. (1939), Statistical Testing of Business Cycle Theories: A Method and Its Application to Investment Activity, 1, Geneva: League of Nations.
Tobin, J. (1950), “A Statistical Demand Function for Food in the USA”, Journal of the Royal Statistical Society, Series A, 113, Part II, pp. 113–49.
Tobin, J. (1999), “My 1950 Food Demand Study in Retrospect”, in J.R. Magnus and M.S. Morgan (eds), Methodology and Tacit Knowledge, Chichester: John Wiley & Sons Ltd, pp. 265–68.
Wiener, N. (1948), Cybernetics, or Control and Communication in the Animal and the Machine, Cambridge, MA: MIT Press.
Wright, S. (1920), “The Relative Importance of Heredity and Environment in Determining the Piebald Pattern of Guinea-Pigs”, Proceedings of the National Academy of Sciences, 6, pp. 320–32.
Yule, G.U. (1897), “On the Theory of Correlation”, Journal of the Royal Statistical Society, 60, pp. 812–54.
Ziliak, S.T. (2008), “Guinnessometrics: The Economic Foundation of ‘Student’s’ t”, Journal of Economic Perspectives, 22(4), pp. 199–216.