Mediator Interpretation and Faster Learning Algorithms for Linear Correlated Equilibria in General Extensive-Form Games
arXiv (2023)
Abstract
A recent paper by Farina and Pipis (2023) established the existence of
uncoupled no-linear-swap regret dynamics with polynomial-time iterations in
extensive-form games. The equilibrium points reached by these dynamics, known
as linear correlated equilibria, are currently the tightest known relaxation of
correlated equilibrium that can be learned in polynomial time in any finite
extensive-form game. However, their properties remain largely unexplored, and
their computation is onerous. In this paper, we provide several contributions
shedding light on the fundamental nature of linear-swap regret. First, we show
a connection between linear deviations and a generalization of communication
deviations in which the player can make queries to a "mediator" who replies
with action recommendations, and, critically, the player is not constrained to
match the timing of the game as would be the case for communication deviations.
We call this latter set the untimed communication (UTC) deviations. We show
that the UTC deviations coincide precisely with the linear deviations, and
therefore that any player minimizing UTC regret also minimizes linear-swap
regret. We then leverage this connection to develop state-of-the-art no-regret
algorithms for computing linear correlated equilibria, both in theory and in
practice. In theory, our algorithms achieve polynomially better per-iteration
runtimes; in practice, our algorithms represent the state of the art by several
orders of magnitude.