Improving GANs Using Optimal Transport

ICLR, 2018. arXiv: abs/1803.05573.

Keywords:
data distribution, large mini-batch, OT-GAN, generative modeling, unbiased mini-batch gradient

Abstract:

We present Optimal Transport GAN (OT-GAN), a variant of generative adversarial nets minimizing a new metric measuring the distance between the generator distribution and the data distribution. This metric, which we call mini-batch energy distance, combines optimal transport in primal form with an energy distance defined in an adversarially learned feature space, resulting in a highly discriminative distance function with unbiased mini-batch gradients. Experimentally, OT-GAN is shown to be uniquely stable when trained with large mini-batches and to achieve state-of-the-art results on several common benchmarks.
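For reference, the mini-batch energy distance named in the abstract can be written as follows. This is a sketch based on the description given here: W_c denotes a primal-form (entropy-regularized, Sinkhorn-computed) optimal transport cost between mini-batches under a cost c defined in the adversarially learned feature space, X and X' are independent mini-batches from the data distribution, and Y and Y' are independent mini-batches from the generator.

```latex
\mathcal{D}_{\mathrm{MED}}^{2}(p_{\mathrm{data}}, p_{\theta})
  = 2\,\mathbb{E}\!\left[\mathcal{W}_{c}(X, Y)\right]
  - \mathbb{E}\!\left[\mathcal{W}_{c}(X, X')\right]
  - \mathbb{E}\!\left[\mathcal{W}_{c}(Y, Y')\right]
```

This has the form of the energy distance with the per-sample metric replaced by an optimal transport cost between whole mini-batches, which is why the method is evaluated with large mini-batch sizes.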

Introduction
  • Generative modeling is a major sub-field of Machine Learning that studies the problem of how to learn models that generate images, audio, video, text or other data.
  • Applications of generative models include image compression, generating speech from text, planning in reinforcement learning, semi-supervised and unsupervised representation learning, and many others.
  • The critic defines a distance between the model distribution and the data distribution, which the generative model can optimize to produce data that more closely resembles the training data; the resulting objective is sketched below.
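Read this way, training is a minimax problem: the generator parameters θ are updated to minimize the distance, while the critic parameters η that define the feature space are updated to keep the distance discriminative. A sketch of that objective, with D standing for the learned distance between the data distribution and the generator distribution p_θ, and v_η for the critic:

```latex
\min_{\theta}\;\max_{\eta}\;\; \mathcal{D}\big(p_{\mathrm{data}},\, p_{\theta};\, v_{\eta}\big)
```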
Highlights
  • Generative modeling is a major sub-field of Machine Learning that studies the problem of how to learn models that generate images, audio, video, text or other data
  • In this paper we present OT-GAN (Optimal Transport GAN), a variant of generative adversarial nets incorporating primal-form optimal transport into its critic
  • The samples generated by OT-GAN contain fewer nonsensical images, and the sample quality is significantly better than that of a tuned DCGAN variant, which still suffers from mode collapse
  • We have presented OT-GAN, a new variant of generative adversarial nets where the generator is trained to minimize a novel distance metric over probability distributions
  • OT-GAN was shown to be uniquely stable when trained with large mini-batches and to achieve state-of-the-art results on several common benchmarks
Methods
  • Table 1 (Inception scores on CIFAR-10; all models trained without supervision):

    Method          Inception score
    Real Data       11.95 ± .12
    DCGAN            6.16 ± .07
    Improved GAN     6.86 ± .06
    Denoising FM     7.72 ± .13
    WGAN-GP          7.86 ± .07
    OT-GAN           8.47 ± .12

    [Figure: Inception score on CIFAR-10 over the number of training epochs (up to 4000) for OT-GAN trained with different mini-batch sizes.]

    ImageNet Dogs (Section 5.3)

    To illustrate the ability of OT-GAN to generate high-quality images on more complex data sets, the authors train OT-GAN to generate 128×128 images on the dog subset of ImageNet (Russakovsky et al., 2015).
  • The superior image quality is confirmed by the inception score achieved by OT-GAN (8.97 ± 0.09) on this dataset, which outperforms that of DCGAN (8.19 ± 0.11).
  • To further demonstrate the effectiveness of the proposed method on conditional image synthesis, the authors compare OT-GAN with state-of-the-art models on text-to-image generation (Reed et al., 2016b;a; Zhang et al., 2017).
  • As shown in Table 2, the images generated by OT-GAN with batch size 2048 achieve the best inception score.
  • Example images generated by the conditional generative model on the CUB test set are presented in Figure 7.
  • Table 2 (Inception scores on the CUB test set):

    Reed et al. (2016b)        2.88 ± .04
    Reed et al. (2016a)        3.62 ± .07
    Zhang et al. (2017)        3.70 ± .04
    OT-GAN (batch size 2048)   3.84 ± .05
Conclusion
  • The authors have presented OT-GAN, a new variant of GANs where the generator is trained to minimize a novel distance metric over probability distributions.
  • This metric, which the authors call mini-batch energy distance, combines optimal transport in primal form with an energy distance defined in an adversarially learned feature space, resulting in a highly discriminative distance function with unbiased mini-batch gradients; a code sketch of this distance follows below.
  • In future work the authors hope to make the method more computationally efficient, as well as to scale the approach up to multi-machine training, enabling generation of even more challenging and high-resolution image data sets.
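As a concrete illustration of the mini-batch energy distance described in the conclusion, the following is a minimal PyTorch-style sketch, not the authors' implementation. It assumes a critic network mapping images to feature vectors, uses cosine distance between critic features as the transport cost, and approximates the primal optimal transport cost with plain Sinkhorn iterations; the names cosine_cost, sinkhorn, minibatch_energy_distance, n_iters, and eps are illustrative.

```python
import torch
import torch.nn.functional as F


def cosine_cost(a, b):
    """Pairwise cosine-distance cost matrix between two batches of critic features."""
    a = F.normalize(a, dim=1)
    b = F.normalize(b, dim=1)
    return 1.0 - a @ b.t()


def sinkhorn(cost, n_iters=500, eps=0.1):
    """Entropy-regularized optimal transport cost between two uniform mini-batches,
    computed with Sinkhorn-Knopp iterations (eps is the entropy regularization)."""
    n, m = cost.shape
    K = torch.exp(-cost / eps)                  # Gibbs kernel
    r = cost.new_full((n,), 1.0 / n)            # uniform source marginal
    c = cost.new_full((m,), 1.0 / m)            # uniform target marginal
    v = cost.new_ones(m)
    for _ in range(n_iters):
        u = r / (K @ v)
        v = c / (K.t() @ u)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)  # approximate transport plan
    return (plan * cost).sum()


def minibatch_energy_distance(x, x2, y, y2, critic, n_iters=500, eps=0.1):
    """Symmetric mini-batch estimate of 2*E[W(X,Y)] - E[W(X,X')] - E[W(Y,Y')],
    where x, x2 are independent real mini-batches and y, y2 generated ones."""
    fx, fx2, fy, fy2 = critic(x), critic(x2), critic(y), critic(y2)
    W = lambda a, b: sinkhorn(cosine_cost(a, b), n_iters, eps)
    cross = 0.5 * (W(fx, fy) + W(fx, fy2) + W(fx2, fy) + W(fx2, fy2))
    within = W(fx, fx2) + W(fy, fy2)
    return cross - within
```

In training, the generator would take gradient steps to decrease this quantity while the critic takes steps to increase it, matching the adversarial setup described above.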
Tables
  • Table1: Inception scores on CIFAR-10. All the models are trained in an unsupervised manner
  • Table2: Inception scores by state-of-the-art methods (Reed et al., 2016b;a; Zhang et al., 2017) and the proposed OT-GAN on the CUB test set. Higher inception scores mean better image quality
  • Table3: Generator architecture for CIFAR-10
  • Table4: Critic architecture for CIFAR-10
Funding
  • Presents Optimal Transport GAN (OT-GAN), a variant of generative adversarial nets minimizing a new metric measuring the distance between the generator distribution and the data distribution
  • Presents OT-GAN, a variant of generative adversarial nets incorporating primal-form optimal transport into its critic
  • Provides the preliminaries required to understand our work, and puts our contribution into context by discussing the relevant literature
Reference
  • Martin Arjovsky, Soumith Chintala, and Leon Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.
  • Marc G. Bellemare, Ivo Danihelka, Will Dabney, Shakir Mohamed, Balaji Lakshminarayanan, Stephan Hoyer, and Remi Munos. The Cramer distance as a solution to biased Wasserstein gradients. arXiv preprint arXiv:1705.10743, 2017.
  • Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Carl-Johann Simon-Gabriel, and Bernhard Schoelkopf. From optimal transport to generative modeling: the VEGAN cookbook. arXiv preprint arXiv:1705.07642, 2017.
  • Michael Carter. Foundations of Mathematical Economics. MIT Press, 2001.
  • Marco Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in Neural Information Processing Systems, pp. 2292–2300, 2013.
  • Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. Language modeling with gated convolutional networks. arXiv preprint arXiv:1612.08083, 2016.
  • Aude Genevay, Gabriel Peyre, and Marco Cuturi. GAN and VAE from an optimal transport point of view. arXiv preprint arXiv:1706.01807, 2017a.
  • Aude Genevay, Gabriel Peyre, and Marco Cuturi. Sinkhorn-AutoDiff: Tractable Wasserstein learning of generative models. arXiv preprint arXiv:1706.00292, 2017b.
  • Aude Genevay, Gabriel Peyre, and Marco Cuturi. Learning generative models with Sinkhorn divergences. In AISTATS, 2018.
  • Aidan N. Gomez, Mengye Ren, Raquel Urtasun, and Roger B. Grosse. The reversible residual network: Backpropagation without storing activations. In Advances in Neural Information Processing Systems, pp. 2211–2221, 2017.
  • Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pp. 2672–2680, 2014.
  • Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron Courville. Improved training of Wasserstein GANs. arXiv preprint arXiv:1704.00028, 2017.
  • Diederik Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • Lev Borisovich Klebanov, Viktor Benes, and Ivan Saxl. N-distances and their Applications. Charles University in Prague, the Karolinum Press, 2005.
  • Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, 2009.
  • Chengtao Li, David Alvarez-Melis, Keyulu Xu, Stefanie Jegelka, and Suvrit Sra. Distributional adversarial networks. arXiv preprint arXiv:1706.09549, 2017.
  • Luke Metz, Ben Poole, David Pfau, and Jascha Sohl-Dickstein. Unrolled generative adversarial networks. In ICLR, 2017.
  • Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
  • Scott Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, and Honglak Lee. Learning what and where to draw. In NIPS, 2016a.
  • Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee. Generative adversarial text-to-image synthesis. In ICML, 2016b.
  • Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211–252, 2015.
Architecture and Training Details
  • The generator and critic are implemented as convolutional networks. Their architectures are loosely based on DCGAN with various modifications. Weight normalization and data-dependent initialization (Salimans & Kingma, 2016) are used for both.
  • The generator maps latent codes sampled from a Gaussian distribution to images; its main module is a convolution with a 5 × 5 kernel using gated linear units (Dauphin et al., 2016). The main module of the critic is a convolution with a 5 × 5 kernel and stride 2 using the concatenated ReLU activation function (Shang et al., 2016). Notably, the generator and critic do not use an activation normalization technique such as batch or layer normalization.
  • The model is trained using Adam with a learning rate of 3 × 10⁻⁴, β1 = 0.5, β2 = 0.999, updating the generator 3 times for every critic update. OT-GAN includes two additional hyperparameters for the Sinkhorn algorithm: the number of iterations to run the algorithm and the entropy regularization coefficient. A sketch of these building blocks and settings follows below.
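The building blocks and optimizer settings described above could be sketched as follows in PyTorch. Only the ingredients named in the text are taken from the source (5 × 5 convolutions, gated linear units in the generator, stride-2 concatenated-ReLU convolutions in the critic, weight normalization, no batch/layer normalization, Adam with learning rate 3 × 10⁻⁴ and a 3:1 generator-to-critic update ratio); the channel widths, the nearest-neighbor upsampling step, the output layer, and the module names are placeholder assumptions, with the actual layer configurations given in Tables 3 and 4.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import weight_norm


class GeneratorBlock(nn.Module):
    """Upsample, then a 5x5 convolution with a gated linear unit (GLU) activation.
    The convolution outputs 2x the target channels; F.glu halves them again."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = weight_norm(nn.Conv2d(in_ch, 2 * out_ch, kernel_size=5, padding=2))

    def forward(self, x):
        x = F.interpolate(x, scale_factor=2, mode="nearest")  # assumed upsampling step
        return F.glu(self.conv(x), dim=1)


class CriticBlock(nn.Module):
    """Stride-2 5x5 convolution with concatenated ReLU: cat(relu(x), relu(-x))."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = weight_norm(nn.Conv2d(in_ch, out_ch, kernel_size=5, stride=2, padding=2))

    def forward(self, x):
        x = self.conv(x)
        return torch.cat([F.relu(x), F.relu(-x)], dim=1)  # doubles the channel count


# Placeholder networks; no batch or layer normalization is used anywhere.
generator = nn.Sequential(
    GeneratorBlock(256, 128),
    GeneratorBlock(128, 64),
    weight_norm(nn.Conv2d(64, 3, kernel_size=5, padding=2)),
    nn.Tanh(),
)
critic = nn.Sequential(CriticBlock(3, 64), CriticBlock(128, 128), nn.Flatten())

# Adam settings from the text; the generator takes 3 steps for every critic step.
g_opt = torch.optim.Adam(generator.parameters(), lr=3e-4, betas=(0.5, 0.999))
c_opt = torch.optim.Adam(critic.parameters(), lr=3e-4, betas=(0.5, 0.999))
GENERATOR_UPDATES_PER_CRITIC_UPDATE = 3
```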