Learning Strict Nash Equilibria through Reinforcement

Ianni, Antonella (2011): Learning Strict Nash Equilibria through Reinforcement.

Abstract

This paper studies the analytical properties of the reinforcement learning model proposed in Erev and Roth (1998), also termed cumulative reinforcement learning in Laslier et al. (2001). This stochastic model of learning in games accounts for two main elements: the law of effect (positive reinforcement of actions that perform well) and the law of practice (the magnitude of the reinforcement effect decreases with players' experience). The main results of the paper show that, if the solution trajectories of the underlying replicator equation converge exponentially fast, then, with probability arbitrarily close to one, all the realizations of the reinforcement learning process will, from some time on, lie within an ε band of that solution. The paper improves upon results currently available in the literature by showing that a reinforcement learning process that has been running for some time and is found sufficiently close to a strict Nash equilibrium will reach it with probability one.
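
As an illustration of the two elements named in the abstract, the following is a minimal sketch of a cumulative reinforcement rule in a two-player game. It is not taken from the paper: the payoff table `PAYOFFS`, the initial propensity of 1.0, and the helper names `choice_probabilities`, `simulate` and `max_step_size` are all assumptions made for this example, and the paper's exact specification may differ.

```python
import random

# Hypothetical 2x2 coordination game with strictly positive payoffs.
# PAYOFFS[a][b] = (payoff to player 0, payoff to player 1) when
# player 0 plays a and player 1 plays b. Both (0, 0) and (1, 1) are
# strict Nash equilibria of this illustrative game.
PAYOFFS = [
    [(3.0, 3.0), (1.0, 1.0)],
    [(1.0, 1.0), (2.0, 2.0)],
]


def choice_probabilities(propensities):
    """Return choice probabilities proportional to the propensities."""
    total = sum(propensities)
    return [q / total for q in propensities]


def simulate(steps, initial_propensity=1.0, seed=0):
    """Run the learning rule and return both players' final propensities."""
    rng = random.Random(seed)
    propensities = [[initial_propensity] * 2 for _ in range(2)]
    for _ in range(steps):
        actions = [
            rng.choices([0, 1], weights=propensities[player])[0]
            for player in range(2)
        ]
        rewards = PAYOFFS[actions[0]][actions[1]]
        for player in range(2):
            # Law of effect: only the action actually played is reinforced,
            # by the payoff it earned.
            propensities[player][actions[player]] += rewards[player]
    return propensities


def max_step_size(propensities, largest_payoff=3.0):
    """Upper bound on how far one reinforcement can move a choice probability."""
    # Law of practice: the bound shrinks as total propensity accumulates.
    return largest_payoff / (sum(propensities) + largest_payoff)
```

A possible way to inspect a single run (the outcome depends on the seed and on which equilibrium the run drifts toward):

```python
final = simulate(steps=2000)
for player, q in enumerate(final):
    print(player, choice_probabilities(q), max_step_size(q))
```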

Item Type:	MPRA Paper
Original Title:	Learning Strict Nash Equilibria through Reinforcement
Language:	English
Keywords:	Strict Nash Equilibrium, Reinforcement Learning
Subjects:	C - Mathematical and Quantitative Methods > C9 - Design of Experiments > C92 - Laboratory, Group Behavior
D - Microeconomics > D8 - Information, Knowledge, and Uncertainty > D83 - Search; Learning; Information and Knowledge; Communication; Belief; Unawareness
C - Mathematical and Quantitative Methods > C7 - Game Theory and Bargaining Theory > C72 - Noncooperative Games
Item ID:	33936
Depositing User:	Antonella Ianni
Date Deposited:	07 Oct 2011 16:54
Last Modified:	10 Oct 2019 11:40
References:
Arthur, W.B. (1993), "On designing economic agents that behave like human agents," Journal of Evolutionary Economics, 3, 1-22.
Arthur, W.B., Yu.M. Ermoliev and Yu. Kaniovski (1987), "Non-linear Urn Processes: Asymptotic Behavior and Applications," mimeo, IIASA WP-87-85.
Arthur, W.B., Yu.M. Ermoliev and Yu. Kaniovski (1988), "Non-linear Adaptive Processes of Growth with General Increments: Attainable and Unattainable Components of Terminal Set," mimeo, IIASA WP-88-86.
Beggs, A.W. (2005), "On the Convergence of Reinforcement Learning," Journal of Economic Theory, 122, 1-36.
Benaim, M. (1999), "Dynamics of Stochastic Approximation," Le Seminaire de Probabilite, Springer Lecture Notes in Mathematics.
Benaim, M. and J. Weibull (2003), "Deterministic Approximation of Stochastic Evolution in Games," Econometrica, 71, 873-903.
Benveniste, A., Metivier, M. and P. Priouret (1990), Adaptive Algorithms and Stochastic Approximation. Springer-Verlag.
Börgers, T. and R. Sarin (1997), "Learning Through Reinforcement and Replicator Dynamics," Journal of Economic Theory, 77, 1-14.
Camerer, C. and T.H. Ho (1999), "Experience-Weighted Attraction Learning in Normal Form Games," Econometrica, 67(4), 827-874.
Cross, J.G. (1973), "A Stochastic Learning Model of Economic Behavior," Quarterly Journal of Economics, 87, 239-266.
Cross, J.G. (1983), A Theory of Adaptive Economic Behavior. Cambridge: Cambridge University Press.
Erev, I. and A.E. Roth (1998), "Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria," American Economic Review, 88(4), 848-881.
Fudenberg, D. and D. Levine (1998), Theory of Learning in Games. MIT Press.
Hopkins, E. (2002), "Two competing models of how people learn in games," Econometrica, 70, 2141-2166.
Hopkins, E. and M. Posch (2005), "Attainability of boundary points under reinforcement learning," Games and Economic Behavior, 53, 110-125.
Ianni, A. (2007), "Learning Strict Nash Equilibrium through Reinforcement," mimeo, EUI Working Paper 2007/21.
Izquierdo, L.R., Izquierdo, S.S., Gotts, N.M. and J.G. Polhill (2007), "Transient and asymptotic dynamics of reinforcement learning in games," Games and Economic Behavior, 61, 259-276.
Khalil, H.K. (1996), Nonlinear Systems. Prentice Hall.
Laslier, J.F., Topol, R. and B. Walliser (2001), "A Behavioral Learning Process in Games," Games and Economic Behavior, 37, 340-366.
Ljung, L. (1978), "Strong Convergence of a Stochastic Approximation Algorithm," Annals of Statistics, 6, 680-696.
Posch, M. (1997), "Cycling in a stochastic learning algorithm for normal form games," Journal of Evolutionary Dynamics, 7, 193-207.
Ritzberger, K. and J. Weibull (1995), "Evolutionary Selection in normal form games," Econometrica, 63, 1371-1399.
Roth, A. and I. Erev (1995), "Learning in Extensive Form Games: Experimental Data and Simple Dynamic Models in the Intermediate Term," Games and Economic Behavior, 8(1), 164-212.
Taylor, P. (1979), "Evolutionary stable strategies with two types of player," Journal of Applied Probability, 16, 76-83.
Vega-Redondo, F. (2003), Economics and the Theory of Games. Cambridge University Press.
Weibull, J. (1995), Evolutionary Game Theory. MIT Press.
URI:	https://mpra.ub.uni-muenchen.de/id/eprint/33936
