Manheim, David (2023): Building less-flawed metrics: Understanding and creating better measurement and incentive systems. Published in: Patterns, Vol. 4, No. 10 (13 October 2023)
This is the latest version of this item.
Abstract
Metrics are useful for measuring systems and motivating behaviors in academia as well as in public policy, medicine, business, and other systems. Unfortunately, naive application of metrics to a system can distort the system and even undermine the original goal. There are two interrelated problems to overcome in building better metrics in academia and elsewhere. The first, specifying evaluable metrics that correspond to the goals, is well recognized but still often ignored. The second, minimizing perverse effects that undermine the metric or that enable people to game the rewards, is less recognized but is critical. This perspective discusses designing metrics, beginning with design considerations and processes; then presenting specific strategies for mitigating perverse impacts, including secrecy, randomization, diversification, and post hoc specification; and continuing with important desiderata and the tradeoffs involved, with examples of how they can complement each other or differ. Finally, this perspective presents a comprehensive process integrating these ideas.
Item Type: | MPRA Paper |
---|---|
Original Title: | Building less-flawed metrics: Understanding and creating better measurement and incentive systems |
Language: | English |
Keywords: | Metrics, Measurement, Complex Systems, Control Theory, Perverse Incentives, Cobra Effect, Goodhart's Law, Campbell's Law |
Subjects: | D - Microeconomics > D8 - Information, Knowledge, and Uncertainty D - Microeconomics > D8 - Information, Knowledge, and Uncertainty > D80 - General I - Health, Education, and Welfare > I2 - Education and Research Institutions > I26 - Returns to Education I - Health, Education, and Welfare > I2 - Education and Research Institutions > I28 - Government Policy J - Labor and Demographic Economics > J4 - Particular Labor Markets > J48 - Public Policy Z - Other Special Topics > Z1 - Cultural Economics ; Economic Sociology ; Economic Anthropology > Z18 - Public Policy |
Item ID: | 118443 |
Depositing User: | David Manheim |
Date Deposited: | 06 Nov 2023 19:05 |
Last Modified: | 06 Nov 2023 19:05 |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/118443 |
Available Versions of this Item
- Building Less Flawed Metrics. (deposited 21 Dec 2018 14:40)
- Building Less Flawed Metrics: Dodging Goodhart and Campbell's Laws. (deposited 25 Jan 2020 02:21)
- Building less-flawed metrics: Understanding and creating better measurement and incentive systems. (deposited 06 Nov 2023 19:05) [Currently Displayed]