Ianni, Antonella (2011): Learning Strict Nash Equilibria through Reinforcement.

Abstract
This paper studies the analytical properties of the reinforcement learning model proposed in Erev and Roth (1998), also termed cumulative reinforcement learning in Laslier et al. (2001). This stochastic model of learning in games accounts for two main elements: the law of effect (positive reinforcement of actions that perform well) and the law of practice (the magnitude of the reinforcement effect decreases with players' experience). The main results of the paper show that, if the solution trajectories of the underlying replicator equation converge exponentially fast, then, with probability arbitrarily close to one, all the realizations of the reinforcement learning process will, from some time on, lie within an ε band of that solution. The paper improves upon results currently available in the literature by showing that a reinforcement learning process that has been running for some time and is found sufficiently close to a strict Nash equilibrium will reach it with probability one.
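The two elements named in the abstract can be illustrated with a minimal sketch of cumulative reinforcement learning. It is not taken from the paper: it assumes the basic rule in which each player chooses an action with probability proportional to its propensity and then adds the realized payoff to the propensity of the chosen action. The function names, the initial propensity of 1.0, and the 2x2 coordination game with strictly positive payoffs are all hypothetical choices made for illustration.

```python
import random


def choice_probs(propensities):
    """Choice probabilities proportional to the propensities."""
    total = sum(propensities)
    return [q / total for q in propensities]


def simulate(payoffs, steps=5000, init=1.0, seed=0):
    """Run the sketch for two players.

    payoffs[p][a0][a1] is player p's strictly positive payoff when
    player 0 plays a0 and player 1 plays a1 (an assumption of this sketch).
    Returns each player's final choice probabilities.
    """
    rng = random.Random(seed)
    n_actions = (len(payoffs[0]), len(payoffs[0][0]))
    props = [[init] * n_actions[0], [init] * n_actions[1]]
    for _ in range(steps):
        # Each player samples an action with probability proportional
        # to its current propensity.
        actions = [rng.choices(range(len(q)), weights=q)[0] for q in props]
        for p in (0, 1):
            # Law of effect: the chosen action is reinforced by the payoff.
            # Law of practice: the propensity total keeps growing, so the
            # same payoff shifts the choice probabilities less over time.
            props[p][actions[p]] += payoffs[p][actions[0]][actions[1]]
    return [choice_probs(q) for q in props]


# Hypothetical coordination game: (0, 0) and (1, 1) are strict Nash equilibria.
COORDINATION = [
    [[2.0, 0.5], [0.5, 1.0]],  # player 0
    [[2.0, 0.5], [0.5, 1.0]],  # player 1
]

if __name__ == "__main__":
    print(simulate(COORDINATION))
```

Running the script prints each player's final mixed strategy for one seeded realization. Which equilibrium a given run approaches, if any, depends on the seed and the number of steps; the sketch makes no claim about the convergence results proved in the paper.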
Item Type:  MPRA Paper 

Original Title:  Learning Strict Nash Equilibria through Reinforcement 
Language:  English 
Keywords:  Strict Nash Equilibrium, Reinforcement Learning 
Subjects:
C - Mathematical and Quantitative Methods > C9 - Design of Experiments > C92 - Laboratory, Group Behavior
D - Microeconomics > D8 - Information, Knowledge, and Uncertainty > D83 - Search; Learning; Information and Knowledge; Communication; Belief; Unawareness
C - Mathematical and Quantitative Methods > C7 - Game Theory and Bargaining Theory > C72 - Noncooperative Games
Item ID:  33936 
Depositing User:  Antonella Ianni 
Date Deposited:  07. Oct 2011 16:54 
Last Modified:  25. Oct 2015 20:03 
References:
Arthur, W.B. (1993), "On designing economic agents that behave like human agents," Journal of Evolutionary Economics, 3, 1-22.
Arthur, W.B., Yu.M. Ermoliev and Yu. Kaniovski (1987), "Nonlinear Urn Processes: Asymptotic Behavior and Applications," mimeo, IIASA WP-87-85.
Arthur, W.B., Yu.M. Ermoliev and Yu. Kaniovski (1988), "Nonlinear Adaptive Processes of Growth with General Increments: Attainable and Unattainable Components of Terminal Set," mimeo, IIASA WP-88-86.
Beggs, A.W. (2005), "On the Convergence of Reinforcement Learning," Journal of Economic Theory, 122, 1-36.
Benaim, M. (1999), "Dynamics of Stochastic Approximation," Le Séminaire de Probabilités, Springer Lecture Notes in Mathematics.
Benaim, M. and J. Weibull (2003), "Deterministic Approximation of Stochastic Evolution in Games," Econometrica, 71, 873-903.
Benveniste, A., Metivier, M. and P. Priouret (1990), Adaptive Algorithms and Stochastic Approximation. Springer-Verlag.
Börgers, T. and R. Sarin (1997), "Learning Through Reinforcement and Replicator Dynamics," Journal of Economic Theory, 77, 1-14.
Camerer, C. and T.H. Ho (1999), "Experience-Weighted Attraction Learning in Normal Form Games," Econometrica, 67(4), 827-874.
Cross, J.G. (1973), "A Stochastic Learning Model of Economic Behavior," Quarterly Journal of Economics, 87, 239-266.
Cross, J.G. (1983), A Theory of Adaptive Economic Behavior. Cambridge: Cambridge University Press.
Erev, I. and A.E. Roth (1998), "Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria," American Economic Review, 88(4), 848-881.
Fudenberg, D. and D. Levine (1998), Theory of Learning in Games. MIT Press.
Hopkins, E. (2002), "Two competing models of how people learn in games," Econometrica, 70, 2141-2166.
Hopkins, E. and M. Posch (2005), "Attainability of boundary points under reinforcement learning," Games and Economic Behavior, 53, 110-125.
Ianni, A. (2007), "Learning Strict Nash Equilibrium through Reinforcement," mimeo, EUI Working Paper 2007/21.
Izquierdo, L.R., Izquierdo, S.S., Gotts, N.M. and J.G. Polhill (2007), "Transient and asymptotic dynamics of reinforcement learning in games," Games and Economic Behavior, 61, 259-276.
Khalil, H.K. (1996), Nonlinear Systems. Prentice Hall.
Laslier, J.F., Topol, R. and B. Walliser (2001), "A Behavioral Learning Process in Games," Games and Economic Behavior, 37, 340-366.
Ljung, L. (1978), "Strong Convergence of a Stochastic Approximation Algorithm," Annals of Statistics, 6, 680-696.
Posch, M. (1997), "Cycling in a stochastic learning algorithm for normal form games," Journal of Evolutionary Economics, 7, 193-207.
Ritzberger, K. and J. Weibull (1995), "Evolutionary Selection in normal form games," Econometrica, 63, 1371-1399.
Roth, A. and I. Erev (1995), "Learning in Extensive Form Games: Experimental Data and Simple Dynamic Models in the Intermediate Term," Games and Economic Behavior, 8(1), 164-212.
Taylor, P. (1979), "Evolutionary stable strategies with two types of player," Journal of Applied Probability, 16, 76-83.
Vega-Redondo, F. (2003), Economics and the Theory of Games. Cambridge University Press.
Weibull, J. (1995), Evolutionary Game Theory. MIT Press.
URI:  https://mpra.ub.uni-muenchen.de/id/eprint/33936