# A mean-field analysis of two-player zero-sum games

NeurIPS 2020.

Abstract:

Finding Nash equilibria in two-player zero-sum continuous games is a central problem in machine learning, e.g. for training both GANs and robust models. The existence of pure Nash equilibria requires strong conditions which are not typically met in practice. Mixed Nash equilibria exist in greater generality and may be found using mirror...

Introduction

- Multi-objective optimization problems arise in many fields, from economics to civil engineering.
- Game theory provides a lens through which to view multi-agent optimization problems.
- While pure Nash equilibria exist only under strong conditions, mixed Nash equilibria (MNE), where agents draw a strategy from a probability distribution over the set of all strategies, exist in much greater generality [Glicksberg, 1952].
- MNE exist for games in which each player has a continuous loss function, the setting appropriate for optimization problems encountered in machine learning, like GANs [Goodfellow et al, 2014].
- Recent work has had empirical success for GAN training: Hsieh et al [2019] report a mirror-prox algorithm that provides convergence guarantees but does not scale to high-dimensional settings.
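
The mirror-descent route to mixed equilibria mentioned above can be illustrated on a finite zero-sum game, where mirror descent with entropic regularization on the probability simplex reduces to multiplicative weights. The following is a minimal sketch, not the paper's algorithm: the payoff matrix (rock-paper-scissors), step size, starting strategies, and iteration count are all illustrative, and it is the time-averaged iterates, not the last ones, that approximate the MNE.

```python
import numpy as np

# Rock-paper-scissors payoff: the row player minimizes x^T A y,
# the column player maximizes it; the unique MNE is uniform play.
A = np.array([[ 0.0,  1.0, -1.0],
              [-1.0,  0.0,  1.0],
              [ 1.0, -1.0,  0.0]])

def mirror_descent_mne(A, steps=5000, eta=0.02):
    """Entropic mirror descent (multiplicative weights) for both players;
    returns the time-averaged strategies, which approach an MNE."""
    n, m = A.shape
    x = np.array([0.5, 0.3, 0.2])  # deliberately non-uniform starting points,
    y = np.array([0.2, 0.5, 0.3])  # since uniform play is already a fixed point here
    x_sum, y_sum = np.zeros(n), np.zeros(m)
    for _ in range(steps):
        gx, gy = A @ y, A.T @ x                    # each player's payoff gradient
        x = x * np.exp(-eta * gx); x /= x.sum()    # descent step for the min player
        y = y * np.exp(+eta * gy); y /= y.sum()    # ascent step for the max player
        x_sum += x; y_sum += y
    return x_sum / steps, y_sum / steps

x_avg, y_avg = mirror_descent_mne(A)
```

The duality gap max_y L(x_avg, y) - min_x L(x, y_avg) then bounds how far the averaged pair is from an exact MNE; for rock-paper-scissors the averages approach the uniform distribution.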

Highlights

- Multi-objective optimization problems arise in many fields, from economics to civil engineering
- mixed Nash equilibria (MNE) exist for games in which each player has a continuous loss function, the setting appropriate for optimization problems encountered in machine learning, like generative adversarial networks (GANs) [Goodfellow et al, 2014]
- We observe that while mirror descent performs like WFR-DA in low dimensions, it suffers strongly from the curse of dimensionality (Fig. 1)
- To validate the use of WFR-DA in a practical setting, we lift the classical GAN problem into the space of distributions and train deep neural networks using WFR-DA with backpropagation
- In our GAN formulation, each generator is associated with a single particle in a high-dimensional product space of all network parameters
- While lifting to measures is necessary for Nash equilibria to exist for general non-convex losses, one could argue that no lifting is required if the focus is solely on Stackelberg equilibria, as these exist in parameter space under mild conditions on the loss
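
As a simplified reading of the WFR-DA updates described above, the toy sketch below combines the two ingredients of Wasserstein-Fisher-Rao dynamics on the convex-concave payoff L(x, y) = x² − y² + xy, whose equilibrium is pure at (0, 0): particle positions move along payoff gradients (the Wasserstein part) while particle weights are updated multiplicatively by expected payoff (the Fisher-Rao part). This is a schematic illustration under assumed step sizes, particle counts, and payoff, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 50
X, Y = rng.uniform(-1, 1, n), rng.uniform(-1, 1, n)  # particle positions
wx, wy = np.ones(n) / n, np.ones(n) / n              # particle weights

eta_pos, eta_w = 0.05, 0.05  # step sizes for transport and reweighting
for _ in range(500):
    # Moments of the opponent's measure, used in the expected payoff
    # L(x, y) = x^2 - y^2 + x*y averaged over the other player's particles.
    xbar, x2bar = wx @ X, wx @ X**2
    ybar, y2bar = wy @ Y, wy @ Y**2
    # Wasserstein part: transport particles along the expected payoff gradient.
    gX = 2 * X + ybar                  # d/dx E_y[L(x, y)]
    gY = -2 * Y + xbar                 # d/dy E_x[L(x, y)]
    X, Y = X - eta_pos * gX, Y + eta_pos * gY
    # Fisher-Rao part: multiplicative reweighting by expected payoff.
    wx = wx * np.exp(-eta_w * (X**2 - y2bar + X * ybar))
    wy = wy * np.exp(+eta_w * (x2bar - Y**2 + xbar * Y))
    wx, wy = wx / wx.sum(), wy / wy.sum()
```

The weighted means `wx @ X` and `wy @ Y` track each player's aggregate strategy and drift toward the equilibrium at the origin; in the genuinely non-convex-non-concave setting of the paper, the lifted measure itself, not a single point, is the object that converges.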

Results

- The authors observe that while mirror descent performs like WFR-DA in low dimensions, it suffers strongly from the curse of dimensionality (Fig. 1).
- The authors' purpose is two-fold: (i) to show that solving the lifted problem (13) gives satisfying results on toy and real data and (ii) to quantify the effect of increasing the number of particles, and the effect of updating weights simultaneously with positions.
- The different generators identify modes in the real data, performing a form of clustering (Fig. 2 right).
- Using too few generators or discriminators results in a loss of performance
- The authors attribute this to the training dynamics being too far from its mean-field limit.
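
One way to read the particle-count experiments is through the ε in ε-MNE: the exploitability (duality gap) of a candidate pair measures how far it is from an exact equilibrium, and empirical mixtures built from more particles typically have a smaller gap. A hedged sketch on a toy matrix game follows; the payoff matrix and sampling scheme are illustrative, not the paper's experimental setup.

```python
import numpy as np

# Rock-paper-scissors payoff; the min player plays x, the max player y.
# The value of the game is 0 and the unique MNE is uniform play.
A = np.array([[ 0.0,  1.0, -1.0],
              [-1.0,  0.0,  1.0],
              [ 1.0, -1.0,  0.0]])

def exploitability(A, x, y):
    """Smallest eps such that (x, y) is an eps-MNE for the payoff x^T A y:
    the duality gap max_y' x^T A y' - min_x' x'^T A y."""
    return float(np.max(A.T @ x) - np.min(A @ y))

u = np.ones(3) / 3
rock = np.array([1.0, 0.0, 0.0])

eps_exact = exploitability(A, u, u)     # 0.0: the exact MNE
eps_pure = exploitability(A, rock, u)   # 1.0: a pure strategy is fully exploitable

# Empirical mixtures over k particles sampled from the MNE: the expected gap
# shrinks at rate O(1/sqrt(k)) as the number of particles grows.
rng = np.random.default_rng(0)
gaps = {}
for k in (10, 100, 1000):
    counts = np.bincount(rng.choice(3, size=k), minlength=3)
    gaps[k] = exploitability(A, counts / k, u)
```

This mirrors the intuition behind the loss of performance with too few generators or discriminators: a small particle system is a coarse approximation of its mean-field limit.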

Conclusion

**Conclusions and future work**

In this work the authors have explored non-convex-non-concave, high-dimensional games from the perspective of optimal transport.

- In the WFR case, the authors lack a local convergence analysis that would explain the benefits of transport observed empirically, for instance by leveraging sharpness or Polyak-Łojasiewicz results such as those in [Chizat, 2019] or [Sanjabi et al, 2018]
- Another important open question is to obtain Central Limit Theorems for the convergence of the particle dynamics to the mean field dynamics, in the Langevin, the Wasserstein-Fisher-Rao and the pure mirror descent cases.
- Finding efficient algorithms with guarantees for identifying global Stackelberg equilibria for general losses remains an open problem. Fiez et al [2019] propose such an algorithm, but it is second-order and has only local guarantees.

Summary

## Introduction:

Multi-objective optimization problems arise in many fields, from economics to civil engineering.

- Game theory provides a lens through which to view multi-agent optimization problems.
- While pure Nash equilibria exist only under strong conditions, mixed Nash equilibria (MNE), where agents draw a strategy from a probability distribution over the set of all strategies, exist in much greater generality [Glicksberg, 1952].
- MNE exist for games in which each player has a continuous loss function, the setting appropriate for optimization problems encountered in machine learning, like GANs [Goodfellow et al, 2014].
- Recent work has had empirical success for GAN training: Hsieh et al [2019] report a mirror-prox algorithm that provides convergence guarantees but does not scale to high-dimensional settings.
## Objectives:

The authors' goal is to find an ε-mixed Nash equilibrium (ε-MNE) of the game given by L(μx, μy).

## Results:

The authors observe that while mirror descent performs like WFR-DA in low dimensions, it suffers strongly from the curse of dimensionality (Fig. 1).

- The authors' purpose is two-fold: (i) to show that solving the lifted problem (13) gives satisfying results on toy and real data and (ii) to quantify the effect of increasing the number of particles, and the effect of updating weights simultaneously with positions.
- The different generators identify modes in the real data, performing a form of clustering (Fig. 2 right).
- Using too few generators or discriminators results in a loss of performance
- The authors attribute this to the training dynamics being too far from its mean-field limit.
## Conclusion:

**Conclusions and future work**

In this work the authors have explored non-convex-non-concave, high-dimensional games from the perspective of optimal transport.

- In the WFR case, the authors lack a local convergence analysis that would explain the benefits of transport observed empirically, for instance by leveraging sharpness or Polyak-Łojasiewicz results such as those in [Chizat, 2019] or [Sanjabi et al, 2018]
- Another important open question is to obtain Central Limit Theorems for the convergence of the particle dynamics to the mean field dynamics, in the Langevin, the Wasserstein-Fisher-Rao and the pure mirror descent cases.
- Finding efficient algorithms with guarantees for identifying global Stackelberg equilibria for general losses remains an open problem. Fiez et al [2019] propose such an algorithm, but it is second-order and has only local guarantees.

Related work

- Equilibria in Continuous Games: While many algorithms and methods have been proposed to identify MNE [Mertikopoulos et al, 2019, Lin et al, 2018, Nouiehed et al, 2019], to our knowledge very few have focused on the setting of non-convex non-concave games with a continuous strategy space. Many of the relevant studies have dealt with training GANs using gradient descent/ascent (GDA): Heusel et al [2017] demonstrated that under certain strong conditions local Nash equilibria are stable fixed points of GDA in GAN training; Adolphs et al [2018] and Mazumdar et al [2019] propose Hessian-based algorithms whose stable fixed points are exactly local Nash equilibria; Jin et al [2019] define the notion of local minimax and show that these points are almost all equal to the stable limit points of GDA. Hsieh et al [2019] studied mirror descent and mirror-prox on measures, providing convergence guarantees for GAN training. In the context of games, Balduzzi et al [2018] develop a symplectic gradient adjustment (SGA) algorithm for finding stable fixed points in potential games and Hamiltonian games. These works contrast with our point of view, aimed at guaranteeing convergence of the dynamics to an approximate MNE.

Equilibria in GANs: Arora et al [2017] proved the existence of approximate MNE and studied the generalization properties of this approximate solution; their analysis, however, does not provide a constructive method to identify such a solution. In a more explicit setting, Grnarova et al [2017] designed an online-learning algorithm for finding an MNE in GANs under the assumption that the discriminator is a single hidden layer neural network. Our framework holds without making any assumption on the architectures of the discriminator and generator and provides explicit algorithms with convergence guarantees.

Funding

- This work is partially supported by the Alfred P
- Domingo-Enrich is partially supported by the La Caixa Fellowship
- Mensch is supported by the

References

- L. Adolphs, H. Daneshmand, A. Lucchi, and T. Hofmann. Local saddle point optimization: A curvature exploitation approach. arXiv preprint arXiv:1805.05751, 2018.
- M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.
- D. G. Aronson and J. Serrin. Local behavior of solutions of quasilinear parabolic equations. Arch. Rational Mech. Anal., 25(2):81–122, Jan. 1967. ISSN 0003-9527, 1432-0673. doi: 10/dwv9tq.
- S. Arora, R. Ge, Y. Liang, T. Ma, and Y. Zhang. Generalization and equilibrium in generative adversarial nets (GANs). In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pages 224–232. JMLR. org, 2017.
- M. Balandat, W. Krichene, C. Tomlin, and A. Bayen. Minimizing regret on reflexive banach spaces and nash equilibria in continuous zero-sum games. In Advances in Neural Information Processing Systems, pages 154–162, 2016.
- D. Balduzzi, S. Racanière, J. Martens, J. Foerster, K. Tuyls, and T. Graepel. The mechanics of n-player differentiable games. In Proceedings of the International Conference on Machine Learning, 2018.
- L. Bu, R. Babu, B. De Schutter, et al. A comprehensive survey of multi-agent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 38(2):156–172, 2008.
- M. Caron, P. Bojanowski, A. Joulin, and M. Douze. Deep clustering for unsupervised learning of visual features. 2019.
- G. Chirikjian. Stochastic models, information theory, and Lie groups. Number v. 1 in Applied and numerical harmonic analysis. Birkhäuser, 2009. ISBN 9780817672508. URL https://books.google.ca/books?id=lfOMoAEACAAJ.
- L. Chizat. Sparse optimization on measures with over-parameterized gradient descent. arXiv preprint arXiv:1907.10300v1, 2019.
- L. Chizat and F. Bach. On the global convergence of gradient descent for over-parameterized models using optimal transport. In Advances in neural information processing systems, pages 3036–3046, 2018.
- L. Chizat, G. Peyré, B. Schmitzer, and F.-X. Vialard. Unbalanced optimal transport: Dynamic and Kantorovich formulation. arXiv preprint arXiv:1508.05216, 2015.
- L. Chizat, G. Peyré, B. Schmitzer, and F.-X. Vialard. An interpolating distance between optimal transport and Fisher–Rao metrics. Foundations of Computational Mathematics, 18(1):1–44, 2018. ISSN 1615-3383. doi: 10/gcw6xw. URL https://doi.org/10.1007/s10208-016-9331-y.
- C. Daskalakis, P. W. Goldberg, and C. H. Papadimitriou. The complexity of computing a Nash equilibrium. SIAM Journal on Computing, 39(1):195–259, 2009.
- A. Eberle, A. Guillin, and R. Zimmer. Quantitative Harris-type theorems for diffusions and McKean-Vlasov processes. Trans. Amer. Math. Soc., 371:7135–7173, 2019.
- T. Fiez, B. Chasnov, and L. J. Ratliff. Convergence of learning dynamics in stackelberg games, 2019.
- A. Ghosh, V. Kulharia, V. Namboodiri, P. H. S. Torr, and P. K. Dokania. Multi-agent diverse generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
- G. Gidel, H. Berard, G. Vignoud, P. Vincent, and S. Lacoste-Julien. A variational inequality perspective on generative adversarial networks. In International Conference on Learning Representations, 2019.
- I. L. Glicksberg. A further generalization of the Kakutani fixed point theorem, with application to Nash equilibrium points. Proceedings of the American Mathematical Society, 3(1):170–174, 1952.
- I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.
- P. Grnarova, K. Y. Levy, A. Lucchi, T. Hofmann, and A. Krause. An online learning approach to generative adversarial networks. arXiv preprint arXiv:1706.03269, 2017.
- I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. C. Courville. Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems, pages 5767–5777, 2017.
- M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Advances in Neural Information Processing Systems, pages 6626–6637, 2017.
- Y.-P. Hsieh, C. Liu, and V. Cevher. Finding mixed Nash equilibria of generative adversarial networks. In K. Chaudhuri and R. Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 2810–2819, Long Beach, California, USA, 06 2019. PMLR. URL http://proceedings.mlr.press/v97/hsieh19b.html.
- C. Jin, P. Netrapalli, and M. I. Jordan. Minmax optimization: Stable limit points of gradient descent ascent are locally optimal. arXiv preprint arXiv:1902.00618, 2019.
- A. Juditsky, A. Nemirovski, and C. Tauvel. Solving variational inequalities with stochastic mirror-prox algorithm. Stochastic Systems, 1(1):17–58, 2011.
- O. Kallenberg. Foundations of Modern Probability. Probability and Its Applications. Springer New York, 2002. ISBN 9780387953137. URL https://books.google.es/books?id=L6fhXh13OyMC.
- S. Kondratyev, L. Monsaingeon, D. Vorotnikov, et al. A new optimal transport distance on the space of finite Radon measures. Advances in Differential Equations, 21(11/12):1117–1164, 2016.
- D. Lacker. Mean field games and interacting particle systems. Preprint, 2018.
- Q. Lei, J. D. Lee, A. G. Dimakis, and C. Daskalakis. SGD learns one-layer networks in WGANs. arXiv preprint arXiv:1910.07030, 2019.
- M. Liero, A. Mielke, and G. Savaré. Optimal entropy-transport problems and a new Hellinger–Kantorovich distance between positive measures. Inventiones mathematicae, 211(3):969–1117, 03 2018. ISSN 1432-1297. doi: 10.1007/s00222-017-0759-8. URL https://doi.org/10.1007/s00222-017-0759-8.
- Q. Lin, M. Liu, H. Rafique, and T. Yang. Solving weakly-convex-weakly-concave saddle-point problems as weakly-monotone variational inequality. arXiv preprint arXiv:1810.10207, 2018.
- P. A. Markowich and C. Villani. On the trend to equilibrium for the fokker-planck equation: An interplay between physics and functional analysis. In Physics and Functional Analysis, Matematica Contemporanea (SBM) 19, pages 1–29, 1999.
- E. V. Mazumdar, M. I. Jordan, and S. S. Sastry. On finding local Nash equilibria (and only local Nash equilibria) in zero-sum games. arXiv:1901.00838, 2019.
- S. Mei, A. Montanari, and P.-M. Nguyen. A mean field view of the landscape of two-layer neural networks. Proceedings of the National Academy of Sciences, 115(33):E7665–E7671, 2018.
- P. Mertikopoulos, B. Lecouat, H. Zenati, C.-S. Foo, V. Chandrasekhar, and G. Piliouras. Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile. In International Conference on Learning Representations, 2019.
- J. Nash. Non-cooperative games. Annals of Mathematics, pages 286–295, 1951.
- A. Nemirovski. Prox-method with rate of convergence o(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM Journal on Optimization, 15(1):229–251, 2004.
- H. Nikaidô and K. Isoda. Note on non-cooperative convex games. Pacific Journal of Mathematics, 5(Suppl. 1):807–815, 1955.
- M. Nouiehed, M. Sanjabi, T. Huang, J. D. Lee, and M. Razaviyayn. Solving a class of non-convex min-max games using iterative first order methods. In Advances in Neural Information Processing Systems, pages 14905–14916, 2019.
- G. Pavliotis. Stochastic Processes and Applications: Diffusion Processes, the Fokker-Planck and Langevin Equations. Texts in Applied Mathematics. Springer New York, 2014. ISBN 9781493913220. URL https://books.google.com/books?id=mpHFoAEACAAJ.
- A. Porretta. Weak Solutions to Fokker–Planck Equations and Mean Field Games. Arch. Rational Mech. Anal., 216(1):1–62, Apr. 2015. ISSN 0003-9527, 1432-0673. doi: 10/f64gfr.
- E. C. Posner. Random coding strategies for minimum entropy. IEEE Transations on Information Theory, 21 (4):388–391, 1975.
- S. Racanière, T. Weber, D. Reichert, L. Buesing, A. Guez, D. J. Rezende, A. P. Badia, O. Vinyals, N. Heess, Y. Li, et al. Imagination-augmented agents for deep reinforcement learning. In Advances in Neural Information Processing Systems, pages 5690–5701, 2017.
- J. B. Rosen. Existence and uniqueness of equilibrium points for concave n-person games. Econometrica, 33(3): 520–534, 1965.
- G. Rotskoff, S. Jelassi, J. Bruna, and E. Vanden-Eijnden. Global convergence of neuron birth-death dynamics. arXiv preprint arXiv:1902.01843, 2019.
- G. M. Rotskoff and E. Vanden-Eijnden. Neural networks as interacting particle systems: Asymptotic convexity of the loss landscape and universal scaling of the approximation error. arXiv preprint arXiv:1805.00915, 2018.
- M. Sanjabi, M. Razaviyayn, and J. D. Lee. Solving non-convex non-concave min-max games under Polyak-Łojasiewicz condition. arXiv preprint arXiv:1812.02878, 2018.
- F. Santambrogio. Euclidean, metric, and Wasserstein gradient flows: an overview. Bulletin of Mathematical Sciences, 7(1):87–154, 2017.
- J. Sirignano and K. Spiliopoulos. Mean field analysis of neural networks: A central limit theorem. Stochastic Processes and their Applications, 2019.
- A.-S. Sznitman. Topics in propagation of chaos. In P.-L. Hennequin, editor, Ecole d’Eté de Probabilités de Saint-Flour XIX — 1989, pages 165–251, Berlin, Heidelberg, 1991. Springer Berlin Heidelberg. ISBN 978-3-540-46319-1.
- G. Wayne and L. Abbott. Hierarchical control using networks trained with higher-level forward models. Neural Computation, 26(10):2163–2193, 2014.