Training Generative Adversarial Networks by Solving Ordinary Differential Equations

NeurIPS 2020.

Abstract:

The instability of Generative Adversarial Network (GAN) training has frequently been attributed to gradient descent. Consequently, recent methods have aimed to tailor the models and training procedures to stabilise the discrete updates. In contrast, we study the continuous-time dynamics induced by GAN training. Both theory and toy experiments…
Introduction
  • The training of Generative Adversarial Networks (GANs) [12] has seen significant advances over the past several years.
  • The problem is often transformed into one where the objective function is asymmetric (e.g., the generator’s objective is changed to min_φ E_z[−log D(G(z; φ); θ)]).
  • The authors describe this more general setting, which they focus on here, using the joint vector field v(θ, φ) = [−∇_θ ℓ_D(θ, φ), −∇_φ ℓ_G(θ, φ)] (Eq. 2), sketched below.
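
    To make Eq. (2) concrete, here is a sketch of the continuous-time dynamics it defines. The loss symbols ℓ_D and ℓ_G (discriminator and generator losses) are a reconstruction of the paper's notation rather than a verbatim quote:

      \[
        \frac{\mathrm{d}}{\mathrm{d}t}\,(\theta, \phi) \;=\; v(\theta, \phi),
        \qquad
        v(\theta, \phi) \;=\; \bigl[\,-\nabla_{\theta}\,\ell_D(\theta, \phi),\; -\nabla_{\phi}\,\ell_G(\theta, \phi)\,\bigr].
      \]

    Simultaneous gradient descent with step size h is exactly the explicit Euler discretisation (θ, φ) ← (θ, φ) + h·v(θ, φ) of this flow, which is what motivates trying higher-order solvers instead.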
Highlights
  • The training of Generative Adversarial Networks (GANs) [12] has seen significant advances over the past several years
  • We study the continuous-time dynamics induced by gradient descent on the GAN objective for commonly used losses
  • We show that higher-order ordinary differential equation (ODE) solvers lead to better convergence for GANs (see the sketch after this list)
  • Our experiments reveal that using more accurate ODE solvers results in loss profiles that differ significantly from the curves observed in standard GAN training, as shown in Fig. 5
  • Our work explores higher-order approximations of the continuous dynamics induced by GAN training
  • The dynamical systems perspective has been employed for analysing GANs in previous works [24, 30, 20, 2, 11]
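
    The claim that higher-order ODE solvers behave better can be illustrated with a minimal NumPy sketch (not the authors' code). It uses an illustrative 2-D bilinear game with losses ℓ_D = −θφ and ℓ_G = θφ chosen here for exposition, and the helper names v, euler_step, and rk2_step are hypothetical. It contrasts explicit Euler (i.e., simultaneous gradient descent) with a second-order Runge-Kutta (Heun) step applied to the same vector field with the same step size.

      import numpy as np

      def v(x):
          """Toy vector field v(theta, phi) = (-dl_D/dtheta, -dl_G/dphi)
          for the illustrative bilinear losses l_D = -theta*phi, l_G = theta*phi."""
          theta, phi = x
          return np.array([phi, -theta])

      def euler_step(x, h):
          # Explicit Euler = standard simultaneous gradient descent on the ODE.
          return x + h * v(x)

      def rk2_step(x, h):
          # Heun's method: a second-order Runge-Kutta discretisation of the same ODE.
          k1 = v(x)
          k2 = v(x + h * k1)
          return x + 0.5 * h * (k1 + k2)

      h = 0.1
      x_euler = np.array([1.0, 1.0])
      x_rk2 = np.array([1.0, 1.0])
      for _ in range(500):
          x_euler = euler_step(x_euler, h)
          x_rk2 = rk2_step(x_rk2, h)

      # The continuous flow rotates around the equilibrium (0, 0) at constant radius.
      # Euler spirals outward; the second-order step tracks the flow far more closely.
      print("distance from equilibrium, Euler:", np.linalg.norm(x_euler))
      print("distance from equilibrium, RK2:  ", np.linalg.norm(x_rk2))

    On this toy problem the Euler iterate drifts away from the equilibrium while the Heun iterate stays near the circular continuous trajectory, mirroring the qualitative claim in the highlights; this is only an illustration, not the ODE-GAN algorithm from the paper.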
Conclusion
  • Discussion and Relation to Existing Work

    The authors' work explores higher-order approximations of the continuous dynamics induced by GAN training.
  • The dynamical systems perspective has been employed for analysing GANs in previous works [24, 30, 20, 2, 11].
  • Others made related connections: for example, a second-order ODE integrator was considered for GANs in a simple 1-D case by Gemp and Mahadevan [11], and Nagarajan and Kolter [24] analysed the continuous dynamics in a more restrictive setting, namely a min-max game around the optimal solution.
  • The authors hope that the paper encourages more work in the direction of this connection [30] and adds to the valuable body of work on analysing GAN training convergence [31, 10, 21].
Tables
  • Table 1: Numbers taken from the literature are cited. ‡ denotes reproduction in our code. “Best” and “final” indicate the best scores and the scores at the end of 1 million update steps, respectively. The means and standard deviations (shown by ±) are computed from 3 runs with different random seeds. We use bold face for the best scores in each category, including those within one standard deviation of the best.
  • Table 2: Comparison of ODE-GAN and SN-GAN trained on ImageNet
  • Table 3: Ablation studies for ODE-GAN on CIFAR-10 (DCGAN)
References
  • Martin Arjovsky and Léon Bottou. Towards principled methods for training generative adversarial networks. arXiv preprint arXiv:1701.04862, 2017.
  • David Balduzzi, Sebastien Racaniere, James Martens, Jakob Foerster, Karl Tuyls, and Thore Graepel. The mechanics of n-player differentiable games. arXiv preprint arXiv:1802.05642, 2018.
  • Andrew Brock, Jeff Donahue, and Karen Simonyan. Large scale GAN training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096, 2018.
  • Corneel Casert, Kyle Mills, Tom Vieijra, Jan Ryckebusch, and Isaac Tamblyn. Optical lattice experiments at unobserved conditions and scales through generative adversarial deep learning. arXiv preprint arXiv:2002.07055, 2020.
  • Tatjana Chavdarova, Gauthier Gidel, François Fleuret, and Simon Lacoste-Julien. Reducing noise in GAN training with variance reduced extragradient. In Advances in Neural Information Processing Systems, pages 391–401, 2019.
  • Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David K. Duvenaud. Neural ordinary differential equations. In Advances in Neural Information Processing Systems, pages 6571–6583, 2018.
  • J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
  • Brian Dolhansky, Russ Howes, Ben Pflaum, Nicole Baram, and Cristian Canton Ferrer. The Deepfake Detection Challenge (DFDC) preview dataset. arXiv preprint arXiv:1910.08854, 2019.
  • Vincent Dumoulin, Jonathon Shlens, and Manjunath Kudlur. A learned representation for artistic style. In ICLR, 2017.
  • Tanner Fiez, Benjamin Chasnov, and Lillian J. Ratliff. Convergence of learning dynamics in Stackelberg games. arXiv preprint arXiv:1906.01217, 2019.
  • Ian Gemp and Sridhar Mahadevan. Global convergence to the equilibrium of GANs using variational inequalities. arXiv preprint arXiv:1808.01531, 2018.
  • Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.
  • Sven Gowal, Chongli Qin, Po-Sen Huang, Taylan Cemgil, Krishnamurthy Dvijotham, Timothy Mann, and Pushmeet Kohli. Achieving robustness in the wild via adversarial mixing with disentangled representations. arXiv preprint arXiv:1912.03192, 2019.
  • Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C. Courville. Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems, pages 5767–5777, 2017.
  • Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
  • Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4401–4410, 2019.
  • Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In International Conference on Learning Representations (ICLR), 2015.
  • G. M. Korpelevich. The extragradient method for finding saddle points and other problems. Matecon, 12:747–756, 1976.
  • Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. Technical report, Citeseer, 2009.
  • Lars Mescheder, Sebastian Nowozin, and Andreas Geiger. The numerics of GANs. In Advances in Neural Information Processing Systems, pages 1825–1835, 2017.
  • Lars Mescheder, Andreas Geiger, and Sebastian Nowozin. Which training methods for GANs do actually converge? In International Conference on Machine Learning, pages 3478–3487, 2018.
  • Takeru Miyato and Masanori Koyama. cGANs with projection discriminator. arXiv preprint arXiv:1802.05637, 2018.
  • Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957, 2018.
  • Vaishnavh Nagarajan and J. Zico Kolter. Gradient descent GAN optimization is locally stable. In Advances in Neural Information Processing Systems, pages 5585–5595, 2017.
  • Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
  • Lillian J. Ratliff, Samuel A. Burden, and S. Shankar Sastry. On the characterization of local Nash equilibria in continuous games. IEEE Transactions on Automatic Control, 61(8):2301–2307, 2016.
  • Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. Improved techniques for training GANs. In Advances in Neural Information Processing Systems, pages 2234–2242, 2016.
  • Satinder P. Singh, Michael J. Kearns, and Yishay Mansour. Nash convergence of gradient dynamics in general-sum games. In UAI, pages 541–548, 2000.
  • Dávid Terjék. Adversarial Lipschitz regularization. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=Bke_DertPB.
  • Chuang Wang, Hong Hu, and Yue Lu. A solvable high-dimensional model of GAN. In Advances in Neural Information Processing Systems, pages 13759–13768, 2019.
  • Yuanhao Wang, Guodong Zhang, and Jimmy Ba. On solving minimax optimization locally: A follow-the-ridge approach. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=Hkx7_1rKwS.
  • Christa Zoufal, Aurélien Lucchi, and Stefan Woerner. Quantum generative adversarial networks for learning and loading random distributions. npj Quantum Information, 5(1):1–9, 2019.