A mean-field analysis of two-player zero-sum games

NeurIPS 2020.

Keywords: generative adversarial network, high dimensional, mean field, Nash equilibria, machine learning

Abstract:

Finding Nash equilibria in two-player zero-sum continuous games is a central problem in machine learning, e.g. for training both GANs and robust models. The existence of pure Nash equilibria requires strong conditions which are not typically met in practice. Mixed Nash equilibria exist in greater generality and may be found using mirror...

Introduction
  • Multi-objective optimization problems arise in many fields, from economics to civil engineering.
  • Game theory provides a lens through which to view multi-agent optimization problems.
  • In contrast to pure Nash equilibria, which exist only under strong conditions, mixed Nash equilibria (MNE), where agents adopt a strategy drawn from a probability distribution over the set of all strategies, exist in much greater generality [Glicksberg, 1952]; the standard mixed extension is written out after this list.
  • MNE exist for games in which each player has a continuous loss function, the setting appropriate for optimization problems encountered in machine learning, like GANs [Goodfellow et al, 2014].
  • Recent work has had empirical success for GAN training: Hsieh et al [2019] report a mirror-prox algorithm that provides convergence guarantees but does not scale to high-dimensional settings.
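
    For concreteness, the mixed extension referenced above can be written as follows. This is the textbook definition in the notation of the L(μx, μy) used later on this page, not a quotation from the paper:

    % Lift the loss \ell(x, y) to probability measures over the strategy spaces:
    L(\mu_x, \mu_y) = \int_{\mathcal{X}} \int_{\mathcal{Y}} \ell(x, y) \, d\mu_x(x) \, d\mu_y(y)

    % A pair (\hat\mu_x, \hat\mu_y) is a mixed Nash equilibrium (MNE) when neither
    % player can improve by deviating unilaterally:
    L(\hat\mu_x, \mu_y) \le L(\hat\mu_x, \hat\mu_y) \le L(\mu_x, \hat\mu_y)
    \quad \text{for all } \mu_x \in \mathcal{P}(\mathcal{X}), \; \mu_y \in \mathcal{P}(\mathcal{Y})
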
Highlights
  • Multi-objective optimization problems arise in many fields, from economics to civil engineering
  • Mixed Nash equilibria (MNE) exist for games in which each player has a continuous loss function, the setting appropriate for optimization problems encountered in machine learning, like generative adversarial networks (GANs) [Goodfellow et al, 2014]
  • We observe that while mirror descent performs like WFR-DA in low dimensions, it suffers strongly from the curse of dimensionality (Fig. 1)
  • To validate the use of WFR-DA in a practical setting, we lift the classical GAN problem into the space of distributions and train deep neural networks using WFR-DA with backpropagation (a schematic sketch, not the authors' code, follows this list)
  • In our GAN formulation, each generator is associated with a single particle in a high-dimensional product space of all network parameters
  • While the lifting to measures is necessary to guarantee the existence of Nash equilibria for general non-convex losses, one could argue that no lifting is required if the focus is solely on Stackelberg equilibria, as these exist in parameter space under mild conditions on the loss
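
    The sketch below illustrates the lifted formulation with a small weighted mixture of generators and discriminators. It is written for this summary, assuming PyTorch, and is not the authors' WFR-DA implementation: the payoff, architectures, step sizes and the multiplicative weight rule are placeholder choices meant only to show the two ingredients, a gradient step on every network's parameters (the transport part) and a reweighting of the mixture (the Fisher-Rao-like part). All names (mlp, payoff, step) are made up for illustration.

    # Hypothetical sketch of training a weighted mixture of generators and
    # discriminators; illustrative only, not the paper's algorithm verbatim.
    import torch
    import torch.nn as nn

    def mlp(d_in, d_out):
        return nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(), nn.Linear(64, d_out))

    n_gen, n_disc, z_dim, x_dim = 4, 4, 8, 2
    gens   = [mlp(z_dim, x_dim) for _ in range(n_gen)]
    discs  = [mlp(x_dim, 1)     for _ in range(n_disc)]
    log_wg = torch.zeros(n_gen)     # log-weights of the generator mixture
    log_wd = torch.zeros(n_disc)    # log-weights of the discriminator mixture
    opts   = [torch.optim.SGD(net.parameters(), lr=1e-2) for net in gens + discs]
    eta_w  = 0.1                    # step size of the weight (Fisher-Rao-like) update

    def payoff(g, d, real):
        # Toy GAN-style payoff for one (generator, discriminator) pair:
        # the discriminator maximizes it, the generator minimizes it.
        fake = g(torch.randn(real.shape[0], z_dim))
        return (torch.log(torch.sigmoid(d(real))).mean()
                + torch.log(1 - torch.sigmoid(d(fake))).mean())

    def step(real):
        wg, wd = torch.softmax(log_wg, 0), torch.softmax(log_wd, 0)
        # Pairwise payoffs between every generator and every discriminator.
        V = torch.stack([torch.stack([payoff(g, d, real) for d in discs]) for g in gens])
        value = (wg[:, None] * wd[None, :] * V).sum()   # mixture game value
        for o in opts:
            o.zero_grad()
        value.backward()
        for d in discs:                 # discriminators ascend: flip their gradients
            for p in d.parameters():
                if p.grad is not None:
                    p.grad.neg_()
        for o in opts:                  # transport part: move every particle in parameter space
            o.step()
        with torch.no_grad():           # reweighting part: adjust the mixture weights
            log_wg -= eta_w * (V @ wd)      # generators favour low payoff
            log_wd += eta_w * (V.t() @ wg)  # discriminators favour high payoff

    # Usage: one update on a batch of real samples.
    step(torch.randn(32, x_dim))
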
Results
  • The authors observe that while mirror descent performs like WFR-DA in low dimensions, it suffers strongly from the curse of dimensionality (Fig. 1); a toy grid-based illustration follows this list.
  • The authors' purpose is two-fold: (i) to show that solving the lifted problem (13) gives satisfying results on toy and real data, and (ii) to quantify the effect of increasing the number of particles and of updating weights simultaneously with positions.
  • The different generators identify modes in the real data, performing a form of clustering (Fig. 2 right).
  • Using too few generators or discriminators results in a loss of performance.
  • The authors attribute this to the training dynamics being too far from its mean-field limit.
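
    To make the dimension dependence concrete, the sketch below runs entropic mirror descent-ascent on a uniform grid over each strategy space. It was written for this summary with a made-up payoff and step size, and uses a plain simultaneous update rather than the paper's dual-averaging variant; the point is only that the grid holds K**d points per player, so a grid-based discretization degrades quickly as the dimension d grows.

    import itertools
    import numpy as np

    def grid(d, K):
        # Uniform grid over [-1, 1]^d: K**d points in total.
        pts = np.linspace(-1.0, 1.0, K)
        return np.array(list(itertools.product(pts, repeat=d)))

    def loss(x, y):
        # Placeholder zero-sum payoff: player x minimizes, player y maximizes.
        return float(np.dot(x, y) + 0.5 * np.sum(x**2) - 0.5 * np.sum(y**2))

    def mirror_descent_ascent(d=2, K=11, eta=0.5, steps=500):
        # For d = 2 the grid has 11**2 = 121 points; for d = 10 it would already
        # hold 11**10 (about 2.6e10) points, which is where the curse of
        # dimensionality bites.
        X, Y = grid(d, K), grid(d, K)
        L = np.array([[loss(x, y) for y in Y] for x in X])   # full payoff table
        mu_x = np.full(len(X), 1.0 / len(X))                 # uniform initial mixed strategies
        mu_y = np.full(len(Y), 1.0 / len(Y))
        for _ in range(steps):
            gx = L @ mu_y              # expected payoff of each grid point for the min player
            gy = L.T @ mu_x            # expected payoff of each grid point for the max player
            mu_x *= np.exp(-eta * gx)  # entropic (multiplicative-weights) updates
            mu_y *= np.exp(+eta * gy)
            mu_x /= mu_x.sum()
            mu_y /= mu_y.sum()
        return mu_x, mu_y

    mu_x, mu_y = mirror_descent_ascent()
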
Conclusion
  • Conclusions and future work

    In this work, the authors have explored non-convex non-concave, high-dimensional games from the perspective of optimal transport.
  • In the WFR case, the authors lack a local convergence analysis that would explain the benefits of transport observed empirically, for instance one leveraging sharpness or Polyak-Łojasiewicz results such as those in [Chizat, 2019] or [Sanjabi et al, 2018].
  • Another important open question is to obtain central limit theorems for the convergence of the particle dynamics to the mean-field dynamics, in the Langevin, Wasserstein-Fisher-Rao, and pure mirror descent cases.
  • Finding efficient algorithms with guarantees to identify global Stackelberg equilibria for general losses is an open problem; Fiez et al [2019] propose such an algorithm, but it is second order and has only local guarantees.
Summary
  • Introduction:

    Multi-objective optimization problems arise in many fields, from economics to civil engineering.
  • Game theory provides a lens through which to view multi-agent optimization problems.
  • In contrast to pure Nash equilibria, which exist only under strong conditions, mixed Nash equilibria (MNE), where agents adopt a strategy drawn from a probability distribution over the set of all strategies, exist in much greater generality [Glicksberg, 1952].
  • MNE exist for games in which each player has a continuous loss function, the setting appropriate for optimization problems encountered in machine learning, like GANs [Goodfellow et al, 2014].
  • Recent work has had empirical success for GAN training: Hsieh et al [2019] report a mirror-prox algorithm that provides convergence guarantees but does not scale to high-dimensional settings.
  • Objectives:

    The authors' goal is to find an ε-mixed Nash equilibrium (ε-MNE) of the game given by L(μx, μy); the ε-MNE condition is written out after this summary.
  • Results:

    The authors observe that while mirror descent performs like WFR-DA in low dimensions, it suffers strongly from the curse of dimensionality (Fig. 1).
  • The authors' purpose is two-fold: (i) to show that solving the lifted problem (13) gives satisfying results on toy and real data, and (ii) to quantify the effect of increasing the number of particles and of updating weights simultaneously with positions.
  • The different generators identify modes in the real data, performing a form of clustering (Fig. 2 right).
  • Using too few generators or discriminators results in a loss of performance.
  • The authors attribute this to the training dynamics being too far from its mean-field limit.
  • Conclusion:

    Conclusions and future work

    In this work, the authors have explored non-convex non-concave, high-dimensional games from the perspective of optimal transport.
  • In the WFR case, the authors lack a local convergence analysis that would explain the benefits of transport observed empirically, for instance one leveraging sharpness or Polyak-Łojasiewicz results such as those in [Chizat, 2019] or [Sanjabi et al, 2018].
  • Another important open question is to obtain central limit theorems for the convergence of the particle dynamics to the mean-field dynamics, in the Langevin, Wasserstein-Fisher-Rao, and pure mirror descent cases.
  • Finding efficient algorithms with guarantees to identify global Stackelberg equilibria for general losses is an open problem; Fiez et al [2019] propose such an algorithm, but it is second order and has only local guarantees.
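
    As a reference for the objective stated above, the ε-MNE condition relaxes the equilibrium inequalities by ε on each side. This is the standard definition, written in the notation of the mixed extension displayed after the introduction:

    % (\hat\mu_x, \hat\mu_y) is an \varepsilon-mixed Nash equilibrium of L when
    L(\hat\mu_x, \mu_y) - \varepsilon \le L(\hat\mu_x, \hat\mu_y) \le L(\mu_x, \hat\mu_y) + \varepsilon
    \quad \text{for all } \mu_x \in \mathcal{P}(\mathcal{X}), \; \mu_y \in \mathcal{P}(\mathcal{Y})
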
Related work
  • Equilibria in Continuous Games: While many algorithms and methods have been proposed to identify MNE [Mertikopoulos et al, 2019, Lin et al, 2018, Nouiehed et al, 2019], to our knowledge very few have focused on the setting of non-convex non-concave games with a continuous strategy space. Many of the relevant studies have dealt with training GANs using gradient descent/ascent (GDA): Heusel et al [2017] demonstrated that under certain strong conditions local Nash equilibria are stable fixed points of GDA in GAN training; Adolphs et al [2018] and Mazumdar et al [2019] propose Hessian-based algorithms whose stable fixed points are exactly local Nash equilibria; Jin et al [2019] define the notion of local minimax and show that these points coincide, up to degenerate cases, with the stable limit points of GDA. Hsieh et al [2019] studied mirror descent and mirror-prox on measures, providing convergence guarantees for GAN training. In the context of games, Balduzzi et al [2018] develop a symplectic gradient adjustment (SGA) algorithm for finding stable fixed points in potential games and Hamiltonian games. These works contrast with our point of view, which aims at guaranteeing convergence of the dynamics to an approximate MNE.

    Equilibria in GANs: Arora et al [2017] proved the existence of approximate MNE and studied the generalization properties of this approximate solution; their analysis, however, does not provide a constructive method to identify such a solution. In a more explicit setting, Grnarova et al [2017] designed an online learning algorithm for finding an MNE in GANs under the assumption that the discriminator is a single hidden-layer neural network. Our framework holds without making any assumption on the architectures of the discriminator and generator, and provides explicit algorithms with convergence guarantees.
Funding
  • This work is partially supported by the Alfred P
  • Domingo-Enrich is partially supported by the La Caixa Fellowship
  • Mensch is supported by the
Reference
  • L. Adolphs, H. Daneshmand, A. Lucchi, and T. Hofmann. Local saddle point optimization: A curvature exploitation approach. arXiv preprint arXiv:1805.05751, 2018.
  • M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.
  • D. G. Aronson and J. Serrin. Local behavior of solutions of quasilinear parabolic equations. Arch. Rational Mech. Anal., 25(2):81–122, 1967. ISSN 0003-9527, 1432-0673. doi: 10/dwv9tq.
  • S. Arora, R. Ge, Y. Liang, T. Ma, and Y. Zhang. Generalization and equilibrium in generative adversarial nets (GANs). In Proceedings of the 34th International Conference on Machine Learning, volume 70, pages 224–232. JMLR.org, 2017.
  • M. Balandat, W. Krichene, C. Tomlin, and A. Bayen. Minimizing regret on reflexive Banach spaces and Nash equilibria in continuous zero-sum games. In Advances in Neural Information Processing Systems, pages 154–162, 2016.
  • D. Balduzzi, S. Racanière, J. Martens, J. Foerster, K. Tuyls, and T. Graepel. The mechanics of n-player differentiable games. In Proceedings of the International Conference on Machine Learning, 2018.
  • L. Bu, R. Babu, B. De Schutter, et al. A comprehensive survey of multi-agent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 38(2):156–172, 2008.
  • M. Caron, P. Bojanowski, A. Joulin, and M. Douze. Deep clustering for unsupervised learning of visual features. 2019.
  • G. Chirikjian. Stochastic models, information theory, and Lie groups. Number v. 1 in Applied and Numerical Harmonic Analysis. Birkhäuser, 2009. ISBN 9780817672508. URL https://books.google.ca/books?id=lfOMoAEACAAJ.
  • L. Chizat. Sparse optimization on measures with over-parameterized gradient descent. arXiv preprint arXiv:1907.10300v1, 2019.
  • L. Chizat and F. Bach. On the global convergence of gradient descent for over-parameterized models using optimal transport. In Advances in Neural Information Processing Systems, pages 3036–3046, 2018.
  • L. Chizat, G. Peyré, B. Schmitzer, and F.-X. Vialard. Unbalanced optimal transport: Dynamic and Kantorovich formulation. arXiv preprint arXiv:1508.05216, 2015.
  • L. Chizat, G. Peyré, B. Schmitzer, and F.-X. Vialard. An interpolating distance between optimal transport and Fisher–Rao metrics. Foundations of Computational Mathematics, 18(1):1–44, 2018. doi: 10.1007/s10208-016-9331-y. URL https://doi.org/10.1007/s10208-016-9331-y.
  • C. Daskalakis, P. W. Goldberg, and C. H. Papadimitriou. The complexity of computing a Nash equilibrium. SIAM Journal on Computing, 39(1):195–259, 2009.
  • A. Eberle, A. Guillin, and R. Zimmer. Quantitative Harris-type theorems for diffusions and McKean-Vlasov processes. Trans. Amer. Math. Soc., 371:7135–7173, 2019.
  • T. Fiez, B. Chasnov, and L. J. Ratliff. Convergence of learning dynamics in Stackelberg games. 2019.
  • A. Ghosh, V. Kulharia, V. Namboodiri, P. H. S. Torr, and P. K. Dokania. Multi-agent diverse generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
  • G. Gidel, H. Berard, G. Vignoud, P. Vincent, and S. Lacoste-Julien. A variational inequality perspective on generative adversarial networks. In International Conference on Learning Representations, 2019.
  • I. L. Glicksberg. A further generalization of the Kakutani fixed point theorem, with application to Nash equilibrium points. Proceedings of the American Mathematical Society, 3(1):170–174, 1952.
  • I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.
  • P. Grnarova, K. Y. Levy, A. Lucchi, T. Hofmann, and A. Krause. An online learning approach to generative adversarial networks. arXiv preprint arXiv:1706.03269, 2017.
  • I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. C. Courville. Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems, pages 5767–5777, 2017.
  • M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Advances in Neural Information Processing Systems, pages 6626–6637, 2017.
  • Y.-P. Hsieh, C. Liu, and V. Cevher. Finding mixed Nash equilibria of generative adversarial networks. In K. Chaudhuri and R. Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 2810–2819. PMLR, 2019. URL http://proceedings.mlr.press/v97/hsieh19b.html.
  • C. Jin, P. Netrapalli, and M. I. Jordan. Minmax optimization: Stable limit points of gradient descent ascent are locally optimal. arXiv preprint arXiv:1902.00618, 2019.
  • A. Juditsky, A. Nemirovski, and C. Tauvel. Solving variational inequalities with stochastic mirror-prox algorithm. Stochastic Systems, 1(1):17–58, 2011.
  • O. Kallenberg. Foundations of Modern Probability. Probability and Its Applications. Springer New York, 2002. ISBN 9780387953137. URL https://books.google.es/books?id=L6fhXh13OyMC.
  • S. Kondratyev, L. Monsaingeon, D. Vorotnikov, et al. A new optimal transport distance on the space of finite Radon measures. Advances in Differential Equations, 21(11/12):1117–1164, 2016.
  • D. Lacker. Mean field games and interacting particle systems. Preprint, 2018.
  • Q. Lei, J. D. Lee, A. G. Dimakis, and C. Daskalakis. SGD learns one-layer networks in WGANs. arXiv preprint arXiv:1910.07030, 2019.
  • M. Liero, A. Mielke, and G. Savaré. Optimal entropy-transport problems and a new Hellinger–Kantorovich distance between positive measures. Inventiones mathematicae, 211(3):969–1117, 2018. doi: 10.1007/s00222-017-0759-8. URL https://doi.org/10.1007/s00222-017-0759-8.
  • Q. Lin, M. Liu, H. Rafique, and T. Yang. Solving weakly-convex-weakly-concave saddle-point problems as weakly-monotone variational inequality. arXiv preprint arXiv:1810.10207, 2018.
  • P. A. Markowich and C. Villani. On the trend to equilibrium for the Fokker-Planck equation: An interplay between physics and functional analysis. In Physics and Functional Analysis, Matematica Contemporanea (SBM) 19, pages 1–29, 1999.
  • E. V. Mazumdar, M. I. Jordan, and S. S. Sastry. On finding local Nash equilibria (and only local Nash equilibria) in zero-sum games. arXiv preprint arXiv:1901.00838, 2019.
  • S. Mei, A. Montanari, and P.-M. Nguyen. A mean field view of the landscape of two-layer neural networks. Proceedings of the National Academy of Sciences, 115(33):E7665–E7671, 2018.
  • P. Mertikopoulos, B. Lecouat, H. Zenati, C.-S. Foo, V. Chandrasekhar, and G. Piliouras. Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile. In International Conference on Learning Representations, 2019.
  • J. Nash. Non-cooperative games. Annals of Mathematics, pages 286–295, 1951.
  • A. Nemirovski. Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM Journal on Optimization, 15(1):229–251, 2004.
  • H. Nikaidô and K. Isoda. Note on non-cooperative convex games. Pacific Journal of Mathematics, 5(Suppl. 1):807–815, 1955.
  • M. Nouiehed, M. Sanjabi, T. Huang, J. D. Lee, and M. Razaviyayn. Solving a class of non-convex min-max games using iterative first order methods. In Advances in Neural Information Processing Systems, pages 14905–14916, 2019.
  • G. Pavliotis. Stochastic Processes and Applications: Diffusion Processes, the Fokker-Planck and Langevin Equations. Texts in Applied Mathematics. Springer New York, 2014. ISBN 9781493913220. URL https://books.google.com/books?id=mpHFoAEACAAJ.
  • A. Porretta. Weak solutions to Fokker–Planck equations and mean field games. Arch. Rational Mech. Anal., 216(1):1–62, 2015. ISSN 0003-9527, 1432-0673. doi: 10/f64gfr.
  • E. C. Posner. Random coding strategies for minimum entropy. IEEE Transactions on Information Theory, 21(4):388–391, 1975.
  • S. Racanière, T. Weber, D. Reichert, L. Buesing, A. Guez, D. J. Rezende, A. P. Badia, O. Vinyals, N. Heess, Y. Li, et al. Imagination-augmented agents for deep reinforcement learning. In Advances in Neural Information Processing Systems, pages 5690–5701, 2017.
  • J. B. Rosen. Existence and uniqueness of equilibrium points for concave n-person games. Econometrica, 33(3):520–534, 1965.
  • G. Rotskoff, S. Jelassi, J. Bruna, and E. Vanden-Eijnden. Global convergence of neuron birth-death dynamics. arXiv preprint arXiv:1902.01843, 2019.
  • G. M. Rotskoff and E. Vanden-Eijnden. Neural networks as interacting particle systems: Asymptotic convexity of the loss landscape and universal scaling of the approximation error. arXiv preprint arXiv:1805.00915, 2018.
  • M. Sanjabi, M. Razaviyayn, and J. D. Lee. Solving non-convex non-concave min-max games under Polyak-Łojasiewicz condition. arXiv preprint arXiv:1812.02878, 2018.
  • F. Santambrogio. Euclidean, metric, and Wasserstein gradient flows: an overview. Bulletin of Mathematical Sciences, 7(1):87–154, 2017.
  • J. Sirignano and K. Spiliopoulos. Mean field analysis of neural networks: A central limit theorem. Stochastic Processes and their Applications, 2019.
  • A.-S. Sznitman. Topics in propagation of chaos. In P.-L. Hennequin, editor, Ecole d'Eté de Probabilités de Saint-Flour XIX — 1989, pages 165–251. Springer Berlin Heidelberg, Berlin, Heidelberg, 1991. ISBN 978-3-540-46319-1.
  • G. Wayne and L. Abbott. Hierarchical control using networks trained with higher-level forward models. Neural Computation, 26(10):2163–2193, 2014.