Abstract |
Just as Markov decision processes underpin reinforcement learning, Markov games (also known as stochastic games) form the basis for the study of multi-agent reinforcement learning and sequential agent interaction. We introduce approximate Markov perfect equilibrium as a solution concept for the computational problem of solving finite-state, infinite-horizon discounted stochastic games, and prove that computing it is PPAD-complete. This solution concept preserves the Markov perfect property, which opens the possibility of extending successful multi-agent reinforcement learning algorithms to multi-agent dynamic games, and it enlarges the set of problems known to be PPAD-complete. |