Lefebvre, Germain and Nioche, Aurélien and Bourgeois-Gironde, Sacha and Palminteri, Stefano (2018): An Empirical Investigation of the Emergence of Money: Contrasting Temporal Difference and Opportunity Cost Reinforcement Learning.
Abstract
Money is a fundamental and ubiquitous institution in modern economies, yet the question of its emergence remains central for economists. The monetary search-theoretic approach studies the conditions under which commodity money emerges as a solution to the frictions inherent in inter-individual exchanges in a decentralized economy. Although agents' rationality is classically an essential condition and a prerequisite for any theoretical monetary equilibrium, human subjects often fail to adopt optimal strategies in tasks implementing a search-theoretic paradigm when these strategies are speculative, i.e., involve using a costly medium of exchange to increase the probability of subsequent successful trades. In the present work, we hypothesize that implementing such speculative behaviors relies on reinforcement learning rather than on lifetime utility calculations, as classical economic theory supposes. To test this hypothesis, we operationalized the Kiyotaki and Wright paradigm of money emergence in a multi-step exchange task and fitted two reinforcement learning models to the behavioral data of human subjects performing this task. Each model implements a distinct cognitive hypothesis regarding the weight of future or counterfactual rewards in current decisions. We found that both models outperformed theoretical predictions of subjects' behavior regarding the implementation of speculative strategies, and that this implementation depends on the degree to which opportunity costs are considered in the learning process. Speculating about the marketability advantage of money thus seems to depend on mental simulations of counterfactual events that agents perform in exchange situations.
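As a rough illustration of the contrast drawn in the abstract, the sketch below places a standard one-step temporal-difference (Q-learning) update, in the sense of Watkins and Dayan (1992), next to a hypothetical counterfactual variant that penalizes the obtained reward by the payoff of the forgone action. The abstract does not give the paper's model equations, so the function names, the `weight` parameter, and the form of the opportunity-cost rule are assumptions made for illustration only, not the authors' fitted models.

```python
def td_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update: bootstraps on the best next-state value."""
    next_values = q.get(next_state)
    # Treat unknown or empty next states as terminal (value 0).
    best_next = max(next_values.values()) if next_values else 0.0
    delta = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * delta
    return delta


def opportunity_cost_update(q, state, action, reward, forgone_reward,
                            alpha=0.1, weight=0.5):
    """Hypothetical counterfactual update (an assumption, not the paper's rule):
    the obtained reward is reduced by a weighted payoff of the unchosen action."""
    delta = (reward - weight * forgone_reward) - q[state][action]
    q[state][action] += alpha * delta
    return delta


# Toy exchange task: holding the medium of exchange ("s2") is valuable later.
q = {
    "s": {"accept": 0.0, "refuse": 0.0},
    "s2": {"accept": 1.0, "refuse": 0.0},
}
td_delta = td_update(q, "s", "accept", reward=0.0, next_state="s2")
oc_delta = opportunity_cost_update(q, "s", "refuse", reward=1.0, forgone_reward=1.0)
```

In this toy setup, the temporal-difference rule credits accepting the costly good through the discounted value of the next state, while the counterfactual rule lowers the value of refusing in proportion to what was given up. How strongly forgone outcomes are weighted is exactly the kind of free parameter a model comparison would estimate.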
Item Type: | MPRA Paper |
---|---|
Original Title: | An Empirical Investigation of the Emergence of Money: Contrasting Temporal Difference and Opportunity Cost Reinforcement Learning |
Language: | English |
Keywords: | Money, Speculative Behaviours, Reinforcement Learning |
Subjects: | C - Mathematical and Quantitative Methods > C9 - Design of Experiments > C91 - Laboratory, Individual Behavior D - Microeconomics > D8 - Information, Knowledge, and Uncertainty > D83 - Search ; Learning ; Information and Knowledge ; Communication ; Belief ; Unawareness E - Macroeconomics and Monetary Economics > E0 - General > E03 - Behavioral Macroeconomics |
Item ID: | 85586 |
Depositing User: | Dr Germain Lefebvre |
Date Deposited: | 29 Mar 2018 17:43 |
Last Modified: | 29 Sep 2019 00:51 |
References: | Menger C (1892) The Origin of Money. Econ J 2:239–55. Hicks JR (1935) A Suggestion for Simplifying the Theory of Money. Economica 2(5):1–19. Jones RA (1976) The Origin and Development of Media of Exchange. J Polit Econ 84(4):757–776. Kiyotaki N, Wright R (1989) On Money as a Medium of Exchange. J Polit Econ 97(4):927–954. Roth AE, Erev I (1995) Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term. Games Econ Behav 8(1):164–212. Erev I, Roth AE (1998) Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria. Am Econ Rev 88(4):848–881. Duffy J (2001) Learning to speculate: Experiments with artificial and real agents. J Econ Dyn Control 25(3–4):295–319. Duffy J, Ochs J (1999) Emergence of Money as a Medium of Exchange: An Experimental Study. Am Econ Rev 89(4):847–877. Brown PM (1996) Experimental evidence on money as a medium of exchange. J Econ Dyn Control 20(4):583–600. Watkins CJCH (1989) Learning from Delayed Rewards. Dissertation (Cambridge University). Watkins CJCH, Dayan P (1992) Technical Note: Q-Learning. Mach Learn 8(3):279–292. Sutton RS, Barto AG (1998) Introduction to Reinforcement Learning doi:10.1.1.32.7692. Daunizeau J, Adam V, Rigoux L (2014) VBA: A Probabilistic Treatment of Nonlinear Models for Neurobiological and Behavioural Data. PLoS Comput Biol 10(1):e1003441. Palminteri S, Wyart V, Koechlin E (2017) The Importance of Falsification in Computational Cognitive Modeling. Trends Cogn Sci 21(6):425–433. Arthur B (1991) Designing Economic Agents That Act Like Human Agents: A Behavioral Approach to Bounded Rationality. Am Econ Rev 81(2):353–359. Bereby-Meyer Y, Erev I (1998) On Learning To Become a Successful Loser: A Comparison of Alternative Abstractions of Learning Processes in the Loss Domain. J Math Psychol 42(2–3):266–286. 
Erev I, Bereby-Meyer Y, Roth AE (1999) The effect of adding a constant to all payoffs: Experimental investigation, and implications for reinforcement learning models. J Econ Behav Organ 39(1):111–128. Horita Y, Takezawa M, Inukai K, Kita T, Masuda N (2017) Reinforcement learning accounts for moody conditional cooperation behavior: experimental results. Sci Rep 7. doi:10.1038/srep39275. Byrne RMJ (2016) Counterfactual Thought. Annu Rev Psychol 67(1):135–157. Camille N, et al. (2004) The involvement of the orbitofrontal cortex in the experience of regret. Science 304(5674):1167–1170. Coricelli G, et al. (2005) Regret and its avoidance: A neuroimaging study of choice behavior. Nat Neurosci 8(9):1255–1262. Pastor L, Veronesi P (2009) Learning in Financial Markets. Annu Rev Financ Econ 1(1):361–381. Seru A, Shumway T, Stoffman N (2010) Learning by trading. Rev Financ Stud 23(2):705–739. Gervais S, Odean T (2001) Learning to be overconfident. Rev Financ Stud 14(1):1– Kaldor N (1939) Speculation and Economic Stability. Rev Econ Stud 7(1):1–27. Feiger G (1976) What is Speculation? Q J Econ 90(4):677–687. Kaustia M, Knüpfer S (2008) Do investors overweight personal experience? Evidence from IPO subscriptions. J Finance 63(6):2679–2702. Choi JJ, Laibson D, Madrian BC, Metrick A (2009) Reinforcement learning and savings behavior. J Finance 64(6):2515–2534. Weber M, Welfens F (2011) The follow-on purchase and repurchase behavior of individual investors: An experimental investigation. Die Betriebswirtschaft 71(2):139–154. Strahilevitz MA, Odean T, Barber BM (2011) Once Burned, Twice Shy: How Naive Learning, Counterfactuals, and Regret Affect the Repurchase of Stocks Previously Sold. J Mark Res 48(SPL):S102–S120. Valentin VV, O’Doherty JP (2009) Overlapping Prediction Errors in Dorsal Striatum During Instrumental Learning With Juice and Money Reward in the Human Brain. J Neurophysiol 102(6):3384–3391. 
Kim H, Shimojo S, O’Doherty JP (2011) Overlapping responses for the expectation of juice and money rewards in human ventromedial prefrontal cortex. Cereb Cortex 21(4):769–776. Delgado MR, Labouliere CD, Phelps EA (2006) Fear of losing money? Aversive conditioning with secondary reinforcers. Soc Cogn Affect Neurosci 1(3):250–259. Delgado MR, Jou RL, Phelps EA (2011) Neural systems underlying aversive conditioning in humans with primary and secondary reinforcers. Front Neurosci (MAY). doi:10.3389/fnins.2011.00071. Sescousse G, Redoute J, Dreher J-C (2010) The Architecture of Reward Value Coding in the Human Orbitofrontal Cortex. J Neurosci 30(39):13095–13104. Daw ND, Niv Y, Dayan P (2005) Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat Neurosci 8(12):1704–1711. Gläscher J, Daw N, Dayan P, O’Doherty JP (2010) States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron 66(4):585–595. Tolman EC (1948) Cognitive maps in rats and men. Psychol Rev 55(4):189–208. Lohrenz T, McCabe K, Camerer CF, Montague PR (2007) Neural signature of fictive learning signals in a sequential investment task. Proc Natl Acad Sci 104(22):9493–9498. Palminteri S, Khamassi M, Joffily M, Coricelli G (2015) Contextual modulation of value signals in reward and punishment learning. Nat Commun 6:8096. |
URI: | https://mpra.ub.uni-muenchen.de/id/eprint/85586 |