Model-free policy iteration approach to NCE-based strategy design for linear quadratic Gaussian games.

Automatica (2023)

Abstract
This paper presents a novel, completely model-free method to solve continuous-time linear quadratic (LQ) mean field games (MFGs). We consider the setting where agents can only access their own states and are coupled through their individual infinite-horizon discounted costs. For such games, a set of Nash certainty equivalence (NCE)-based strategies, determined by two gain matrices and the aggregate population effect, i.e., the mean field (MF) state, is developed, providing an ε-Nash equilibrium under certain assumptions. The gain matrices are obtained from two algebraic Riccati equations (AREs), respectively, and the MF state is given by a linear ordinary differential equation (ODE); solving all three equations requires an exact model of the system dynamics. To implement the NCE-based strategy with completely unknown dynamics, the proposed approach develops a single-agent policy iteration (PI) and a single-agent MF generator to compute these quantities. Specifically, model-based iterative equations with guaranteed convergence are first developed to approximate the two gain matrices. Then, the requirement of a dynamical model is removed from the iteration process via the integral reinforcement learning (IRL) technique: a model-free algorithm is constructed using measured data from a selected agent as reinforcement signals over a certain time interval. Finally, based on the obtained gain matrices, the MF state is computed offline from samples collected from the same agent. A numerical example is given to verify the effectiveness of the proposed method. © 2023 Elsevier Ltd. All rights reserved.
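The abstract does not reproduce the paper's model-based iterative equations. As an illustration of the kind of convergent gain iteration it refers to, the sketch below shows the classical Kleinman policy iteration for a standard (undiscounted) continuous-time LQR ARE, with hypothetical matrices A, B, Q, R. It is only an assumption-laden analogue: the paper's iterations additionally handle the discounted cost and the MF coupling, and the IRL step that removes the dependence on A and B is not shown here.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical system data for illustration only (not from the paper).
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

def kleinman_pi(A, B, Q, R, K0, n_iter=30, tol=1e-10):
    """Model-based Kleinman policy iteration for the continuous-time LQR ARE.

    At each step, solve the Lyapunov equation
        (A - B K_k)^T P_k + P_k (A - B K_k) + Q + K_k^T R K_k = 0
    and update the gain K_{k+1} = R^{-1} B^T P_k.
    Starting from a stabilizing K0, P_k converges to the stabilizing ARE solution.
    """
    K = K0
    P = None
    for _ in range(n_iter):
        Acl = A - B @ K
        P_new = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
        K = np.linalg.solve(R, B.T @ P_new)
        if P is not None and np.linalg.norm(P_new - P) < tol:
            P = P_new
            break
        P = P_new
    return P, K

# K0 = 0 is stabilizing here because this illustrative A is already Hurwitz.
P, K = kleinman_pi(A, B, Q, R, K0=np.zeros((1, 2)))
print("ARE solution P:\n", P)
print("Gain K:\n", K)
```

In the model-free version described in the abstract, the Lyapunov solve above would be replaced by a least-squares problem built from integrals of measured state and input data (the IRL step), so that A and B never appear explicitly.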
Keywords
Mean field game, Linear quadratic control, Infinite horizon control, Reinforcement learning