Ianni, Antonella (2011): Learning Strict Nash Equilibria through Reinforcement.
This paper studies the analytical properties of the reinforcement learning model proposed in Erev and Roth (1998), also termed cumulative reinforcement learning in Laslier et al. (2001). This stochastic model of learning in games accounts for two main elements: the law of effect (positive reinforcement of actions that perform well) and the law of practice (the magnitude of the reinforcement effect decreases with players' experience). The main results of the paper show that, if the solution trajectories of the underlying replicator equation converge exponentially fast, then, with probability arbitrarily close to one, all the realizations of the reinforcement learning process will, from some time on, lie within an ε-band of that solution. The paper improves upon results currently available in the literature by showing that a reinforcement learning process that has been running for some time and is found sufficiently close to a strict Nash equilibrium will reach it with probability one.
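As an illustration of the two elements named in the abstract, the following is a minimal sketch of a cumulative reinforcement rule in a two-player game, not the paper's own specification. It assumes strictly positive payoffs, choice probabilities proportional to accumulated propensities, and a hypothetical 2x2 coordination game; the names `choice_probs`, `simulate`, and `PAYOFFS` are invented for this example.

```python
import random


def choice_probs(propensities):
    """Choice probabilities proportional to accumulated propensities."""
    total = sum(propensities)
    return [q / total for q in propensities]


def simulate(payoffs, initial, rounds, seed=0):
    """Run a cumulative reinforcement process for two players.

    payoffs[i][a0][a1] is player i's payoff when player 0 plays a0 and
    player 1 plays a1. Payoffs are assumed strictly positive.
    """
    rng = random.Random(seed)
    props = [list(initial[0]), list(initial[1])]
    for _ in range(rounds):
        # Each player samples an action with probability proportional
        # to that action's current propensity.
        actions = [rng.choices(range(len(p)), weights=p)[0] for p in props]
        for i in range(2):
            # Law of effect: the realized payoff is added to the
            # propensity of the action that was actually played.
            props[i][actions[i]] += payoffs[i][actions[0]][actions[1]]
        # Law of practice: total propensity grows every round, so the
        # same payoff shifts the choice probabilities by less and less.
    return props


# Hypothetical coordination game (an assumption, not from the paper):
# both (0, 0) and (1, 1) are strict Nash equilibria.
PAYOFFS = [
    [[2.0, 0.5], [0.5, 1.0]],  # player 0
    [[2.0, 0.5], [0.5, 1.0]],  # player 1
]

if __name__ == "__main__":
    final = simulate(PAYOFFS, initial=[[1.0, 1.0], [1.0, 1.0]], rounds=5000)
    print([choice_probs(p) for p in final])
```

Running the script with different seeds gives different sample paths; the sketch makes no claim about which equilibrium a given run approaches.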
|Item Type:||MPRA Paper|
|Original Title:||Learning Strict Nash Equilibria through Reinforcement|
|Keywords:||Strict Nash Equilibrium, Reinforcement Learning|
|Subjects:||C - Mathematical and Quantitative Methods > C9 - Design of Experiments > C92 - Laboratory, Group Behavior
D - Microeconomics > D8 - Information, Knowledge, and Uncertainty > D83 - Search; Learning; Information and Knowledge; Communication; Belief
C - Mathematical and Quantitative Methods > C7 - Game Theory and Bargaining Theory > C72 - Noncooperative Games
|Depositing User:||Antonella Ianni|
|Date Deposited:||07. Oct 2011 16:54|
|Last Modified:||16. Feb 2013 05:28|
Arthur, W.B. (1993), "On designing economic agents that behave like human agents," Journal of Evolutionary Economics, 3, 1-22.
Arthur, W.B., Yu.M. Ermoliev and Yu.M. Kaniovski (1987), "Non-linear Urn Processes: Asymptotic Behavior and Applications," mimeo, IIASA WP-87-85.
Arthur, W.B., Yu.M. Ermoliev and Yu.M. Kaniovski (1988), "Non-linear Adaptive Processes of Growth with General Increments: Attainable and Unattainable Components of Terminal Set," mimeo, IIASA WP-88-86.
Beggs, A.W. (2005), "On the Convergence of Reinforcement Learning," Journal of Economic Theory, 122, 1-36.
Benaim, M. (1999), "Dynamics of Stochastic Approximation," Le Séminaire de Probabilités, Springer Lecture Notes in Mathematics.
Benaim, M. and J. Weibull (2003), "Deterministic Approximation of Stochastic Evolution in Games," Econometrica, 71, 873-903.
Benveniste, A., Metivier, M. and P. Priouret (1990), Adaptive Algorithms and Stochastic Approximation. Springer-Verlag.
Börgers, T. and R. Sarin (1997), "Learning Through Reinforcement and Replicator Dynamics," Journal of Economic Theory, 77, 1-14.
Camerer, C. and T.H. Ho (1999), "Experience-Weighted Attraction Learning in Normal Form Games," Econometrica, 67(4), 827-874.
Cross, J.G. (1973), "A Stochastic Learning Model of Economic Behavior," Quarterly Journal of Economics, 87, 239-266.
Cross, J.G. (1983), A Theory of Adaptive Economic Behavior. Cambridge: Cambridge University Press.
Erev, I. and A.E. Roth (1998), "Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria," American Economic Review, 88(4), 848-881.
Fudenberg, D. and D. Levine (1998), Theory of Learning in Games. MIT Press.
Hopkins, E. (2002), "Two competing models of how people learn in games," Econometrica, 70, 2141-2166.
Hopkins, E. and M. Posch (2005), "Attainability of boundary points under reinforcement learning," Games and Economic Behavior, 53, 110-125.
Ianni, A. (2007), "Learning Strict Nash Equilibrium through Reinforcement," mimeo, EUI Working Paper 2007/21.
Izquierdo, L.R., Izquierdo, S.S., Gotts, N.M. and J.G. Polhill (2007), "Transient and asymptotic dynamics of reinforcement learning in games," Games and Economic Behavior, 61, 259-276.
Khalil, H.K. (1996), Nonlinear Systems. Prentice Hall.
Laslier, J.F., Topol, R. and B. Walliser (2001), "A Behavioral Learning Process in Games," Games and Economic Behavior, 37, 340-366.
Ljung, L. (1978), "Strong Convergence of a Stochastic Approximation Algorithm," Annals of Statistics, 6, 680-696.
Posch, M. (1997), "Cycling in a stochastic learning algorithm for normal form games," Journal of Evolutionary Economics, 7, 193-207.
Ritzberger, K. and J. Weibull (1995), "Evolutionary Selection in normal form games," Econometrica, 63, 1371-1399.
Roth, A. and I. Erev (1995), "Learning in Extensive Form Games: Experimental Data and Simple Dynamic Models in the Intermediate Term," Games and Economic Behavior, 8(1), 164-212.
Taylor, P. (1979), "Evolutionarily stable strategies with two types of player," Journal of Applied Probability, 16, 76-83.
Vega-Redondo, F. (2003), Economics and the Theory of Games. Cambridge University Press.
Weibull, J. (1995), Evolutionary Game Theory. MIT Press.