Independent Policy Gradient Methods for Competitive Reinforcement Learning

Dylan J. Foster
Noah Golowich

NeurIPS, 2020.


Abstract:

We obtain global, non-asymptotic convergence guarantees for independent learning algorithms in competitive reinforcement learning settings with two agents (i.e., zero-sum stochastic games). We consider an episodic setting where in each episode, each player independently selects a policy and observes only their own actions and rewards, a...
