The Advantage Regret-Matching Actor-Critic

Audrūnas Gruslys
Finbarr Timbers
Martin Schmid
Dustin Morrill
Vinicius Zambaldi
Jean-Baptiste Lespiau
John Schultz
Michael Bowling

Abstract:

Regret minimization has played a key role in online learning, equilibrium computation in games, and reinforcement learning (RL). In this paper, we describe a general model-free RL method for no-regret learning based on repeated reconsideration of past behavior. We propose a model-free RL algorithm, the Advantage Regret-Matching Actor-Critic...
