MineGAN: effective knowledge transfer from GANs to target domains with few images

CVPR, pp. 9329-9338, 2020.

Keywords:
Alexei A. Efros, Batch Statistics Adaptation, conditional GANs, high quality, Fréchet Inception Distance

Abstract:

One of the attractive characteristics of deep neural networks is their ability to transfer knowledge obtained in one domain to other related domains. As a result, high-quality networks can be trained in domains with relatively little training data. This property has been extensively studied for discriminative networks but has received significantly less attention for generative models.

Introduction
  • Generative adversarial networks (GANs) can learn the complex underlying distribution of image collections [10].
  • They have been shown to generate high-quality realistic images [14, 15, 4] and are used in many applications including image manipulation [13, 41], style transfer [9], compression [33], and colorization [38].
  • The focus of this paper is on performing these operations using only a small target set of images, and without access to the large datasets used to pretrain the models.
Highlights
  • Generative adversarial networks (GANs) can learn the complex underlying distribution of image collections [10]
  • We address knowledge transfer by adapting a trained generative model for targeted image generation given a small sample of the target distribution
  • We introduce a novel miner network to steer the sampling of the latent distribution of a pretrained GAN to a target distribution determined by few images (see the sketch after this list).
  • Fréchet Inception Distance (FID) measures the similarity between two sets in the embedding space given by the features of a convolutional neural network.
  • We present a knowledge transfer model for generative models.
  • It is based on a mining operation that identifies the regions on the learned GAN manifold that are closer to a given target domain.
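To make the mining idea concrete, here is a minimal PyTorch sketch of the mining stage, under the following assumptions: the miner is a small MLP (its depth and width here are arbitrary), the pretrained generator G stays frozen because only the miner's and discriminator's parameters are handed to their optimizers, and a plain non-saturating GAN loss stands in for the paper's actual training objective. The names Miner and mining_step are hypothetical, not taken from the paper's code.

```python
import torch
import torch.nn as nn

class Miner(nn.Module):
    """Small MLP mapping the original prior u ~ N(0, I) to a mined latent z."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(),
            nn.Linear(dim, dim), nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, u):
        return self.net(u)

def mining_step(miner, G, D, real, opt_m, opt_d, dim=128):
    """One adversarial update. G is the frozen pretrained generator: only the
    miner and the discriminator D learn, since opt_m and opt_d hold only
    their parameters. D is assumed to return one logit per image."""
    bce = nn.BCEWithLogitsLoss()
    n = real.size(0)
    ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)

    # Discriminator: separate the few real target images from mined samples.
    opt_d.zero_grad()
    fake = G(miner(torch.randn(n, dim)))
    d_loss = bce(D(real), ones) + bce(D(fake.detach()), zeros)
    d_loss.backward()
    opt_d.step()

    # Miner: steer the latent distribution towards regions of G's manifold
    # whose outputs D accepts as target-domain images.
    opt_m.zero_grad()
    m_loss = bce(D(G(miner(torch.randn(n, dim)))), ones)
    m_loss.backward()
    opt_m.step()
```

In the full method this mining stage is followed by fine-tuning of the generator as well; the "MineGAN (w/o FT)" variant in the experiments stops after mining.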
Methods
  • The baselines are training from scratch, TransferGAN [35], VAE [18], and BSA [26].
  • The pretrained source GANs generate cars and buses in a variety of colors.
  • The authors collect a target dataset of 200 images of red vehicles, which contains both red cars and red buses.
  • The authors consider three target sets with different car:bus ratios (0.3:0.7, 0.5:0.5, and 0.7:0.3), which allows them to evaluate the estimated probabilities pi of the selector (see the selector sketch after this list).
  • Figure: qualitative comparison of training from scratch, TransferGAN (from several source GANs), MineGAN (w/o FT), and MineGAN on the {Car, Bus} → Red vehicle task.
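When transferring from multiple pretrained GANs, the selector's probabilities pi decide which source each sample is mined from. The sketch below is a hypothetical rendering: a softmax over learnable logits yields pi, and sampling picks one (miner, generator) pair per image. Since drawing an index is non-differentiable, training the logits in practice needs a relaxation such as Gumbel-softmax or a score-function estimator; the paper's exact scheme may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Selector(nn.Module):
    """Categorical distribution pi over the pretrained source GANs."""
    def __init__(self, num_sources):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_sources))

    def probs(self):
        # These are the estimated pi reported in Tables 1 and 3.
        return F.softmax(self.logits, dim=0)

def sample_multi(selector, miners, generators, n, dim=128):
    """Draw n images: choose a source GAN per sample according to pi,
    then mine its latent space as in the single-GAN case."""
    idx = torch.multinomial(selector.probs(), n, replacement=True)
    return torch.cat([generators[i](miners[i](torch.randn(1, dim)))
                      for i in idx.tolist()], dim=0)
```

With target car:bus ratios of 0.3:0.7, 0.5:0.5, and 0.7:0.3, a well-trained selector should recover pi close to those ratios, which is exactly what Table 1 (bottom) and Table 3 check.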
Results
  • Evaluation measures

    The authors employ the widely used Fréchet Inception Distance (FID) [12] for evaluation (a computational sketch of FID follows this list).
  • FID measures the similarity between two sets in the embedding space given by the features of a convolutional neural network.
  • FID measures both the quality and diversity of the generated images and has been shown to correlate well with human perception [12].
  • However, FID suffers from instability on small datasets, so the authors additionally report Kernel Maximum Mean Discrepancy (KMMD) and Mean Variance (MV).
  • Low KMMD values indicate high-quality images, while high MV values indicate more image diversity.
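FID itself is straightforward to compute once features are extracted: embed both image sets with a fixed CNN (Inception-v3 in [12]), fit a Gaussian to each feature set, and take the Fréchet distance between the two Gaussians. Below is a minimal NumPy/SciPy sketch over precomputed feature matrices; the Inception feature extractor is assumed to be available separately.

```python
import numpy as np
from scipy import linalg

def fid(feats_real, feats_fake):
    """Frechet Inception Distance between two (n, d) feature matrices:
    FID = ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 (C_r C_f)^(1/2)).
    """
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    c_r = np.cov(feats_real, rowvar=False)
    c_f = np.cov(feats_fake, rowvar=False)
    covmean, _ = linalg.sqrtm(c_r @ c_f, disp=False)
    covmean = covmean.real  # discard tiny imaginary parts from numerical error
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(c_r + c_f - 2.0 * covmean))
```

The covariance estimates are what makes FID unstable on small sets: with few samples, C_r and C_f are poorly conditioned, which motivates the switch to KMMD and MV above.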
Conclusion
  • The authors presented a knowledge transfer model for generative models.
  • It is based on a mining operation that identifies the regions on the learned GAN manifold that are closer to a given target domain.
  • Mining leads to more effective and efficient fine-tuning, even with few target-domain images.
  • The authors demonstrated that MineGAN can be used to transfer knowledge from multiple domains.
Tables
  • Table 1: Results for {Car, Bus} → Red vehicles with three different target data distributions (car:bus ratios of 0.3:0.7, 0.5:0.5, and 0.7:0.3) and {Living room, Bridge, Church, Kitchen} → Tower/Bedroom. (Top) FID scores between real and generated samples. (Bottom) Estimated probabilities pi for each model. Table 1 (top) shows that our method obtains significantly better FID scores even when the most relevant pretrained GAN is chosen to initialize training for TransferGAN. Table 1 (bottom) shows that the miner identifies the relevant pretrained models, e.g. transferring knowledge from Bridge and Church for the target domain Tower. Finally, Fig. 7 (right) provides visual examples.
  • Table 2: Distance between real data and generated samples as measured by FID score and KMMD value. The off-manifold results correspond to ImageNet → Places365, and the on-manifold results correspond to ImageNet → ImageNet. We also indicate whether the method requires the target label. Finally, we show the inference time for the various methods in milliseconds.
  • Table 3: Estimated probabilities pi for {Car, Bus} → Red vehicles for MineGAN with mean or max in Eqs. (5) and (6). The actual data distribution is 0.3:0.7 (car:bus ratio).
  • Table 4: Quantitative results of mining on MNIST, expressed as FID / classifier error.
Related work
  • Generative adversarial networks. GANs consist of two modules: a generator and a discriminator [10]. The generator aims to generate images that fool the discriminator, while the discriminator aims to distinguish generated from real images. Training GANs was initially difficult, due to mode collapse and training instability. Several methods focus on addressing these problems [11, 31, 21, 3, 22], while another major line of research aims to improve the architectures to generate higher-quality images [29, 6, 14, 16, 4]. For example, Progressive GAN [14] generates better images by synthesizing them progressively from low to high resolution. Finally, BigGAN [4] successfully performs highly realistic conditional generation on ImageNet [5].
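For reference, [10] formalizes this two-player game as the minimax objective

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))],$$

which is also the game that mining plays, except that the generator's input is first transformed by the miner while the generator itself stays frozen.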
Funding
  • We acknowledge the support from Huawei Kirin Solution, the Spanish projects TIN2016-79717-R and RTI2018-102285-A-I00, the CERCA Program of the Generalitat de Catalunya, and the EU Marie Skłodowska-Curie grant agreement No. 6655919.
References
  • [1] Anonymous, the Danbooru community, Gwern Branwen, and Aaron Gokaslan. Danbooru2018: A large-scale crowdsourced and tagged anime illustration dataset. https://www.gwern.net/Danbooru2018, 2019.
  • [2] Martín Arjovsky and Léon Bottou. Towards principled methods for training generative adversarial networks. In ICLR, 2017.
  • [3] Martín Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein GAN. In ICML, 2017.
  • [4] Andrew Brock, Jeff Donahue, and Karen Simonyan. Large scale GAN training for high fidelity natural image synthesis. In ICLR, 2019.
  • [5] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, pages 248–255, 2009.
  • [6] Emily L. Denton, Soumith Chintala, Rob Fergus, et al. Deep generative image models using a Laplacian pyramid of adversarial networks. In NeurIPS, pages 1486–1494, 2015.
  • [7] Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, and Trevor Darrell. DeCAF: A deep convolutional activation feature for generic visual recognition. In ICML, pages 647–655, 2014.
  • [8] Vincent Dumoulin, Jonathon Shlens, and Manjunath Kudlur. A learned representation for artistic style. In ICLR, 2017.
  • [9] Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. Image style transfer using convolutional neural networks. In CVPR, pages 2414–2423, 2016.
  • [10] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In NeurIPS, pages 2672–2680, 2014.
  • [11] Ishaan Gulrajani, Faruk Ahmed, Martín Arjovsky, Vincent Dumoulin, and Aaron C. Courville. Improved training of Wasserstein GANs. In NeurIPS, pages 5767–5777, 2017.
  • [12] Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In NeurIPS, pages 6626–6637, 2017.
  • [13] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. Image-to-image translation with conditional adversarial networks. In CVPR, pages 1125–1134, 2017.
  • [14] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. In ICLR, 2018.
  • [15] Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In CVPR, 2019.
  • [16] Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In CVPR, pages 4401–4410, 2019.
  • [17] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
  • [18] Diederik P. Kingma and Max Welling. Auto-encoding variational Bayes. In ICLR, 2014.
  • [19] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In NeurIPS, pages 1097–1105, 2012.
  • [20] Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In ICCV, pages 3730–3738, 2015.
  • [21] Xudong Mao, Qing Li, Haoran Xie, Raymond Y. K. Lau, Zhen Wang, and Stephen Paul Smolley. Least squares generative adversarial networks. In ICCV, pages 2794–2802, 2017.
  • [22] Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. Spectral normalization for generative adversarial networks. In ICLR, 2018.
  • [23] Takeru Miyato and Masanori Koyama. cGANs with projection discriminator. In ICLR, 2018.
  • [24] Anh Nguyen, Jeff Clune, Yoshua Bengio, Alexey Dosovitskiy, and Jason Yosinski. Plug & play generative networks: Conditional iterative generation of images in latent space. In CVPR, pages 4467–4477, 2017.
  • [25] Anh Nguyen, Alexey Dosovitskiy, Jason Yosinski, Thomas Brox, and Jeff Clune. Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. In NeurIPS, pages 3387–3395, 2016.
  • [26] Atsuhiro Noguchi and Tatsuya Harada. Image generation from small datasets via batch statistics adaptation. In ICCV, 2019.
  • [27] Maxime Oquab, Léon Bottou, Ivan Laptev, and Josef Sivic. Learning and transferring mid-level image representations using convolutional neural networks. In CVPR, pages 1717–1724, 2014.
  • [28] Sinno Jialin Pan and Qiang Yang. A survey on transfer learning. TKDE, 22(10):1345–1359, 2010.
  • [29] Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. In ICLR, 2016.
  • [30] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, et al. ImageNet large scale visual recognition challenge. IJCV, 115(3):211–252, 2015.
  • [31] Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. Improved techniques for training GANs. In NeurIPS, pages 2234–2242, 2016.
  • [32] Konstantin Shmelkov, Cordelia Schmid, and Karteek Alahari. How good is my GAN? In ECCV, pages 213–229, 2018.
  • [33] Michael Tschannen, Eirikur Agustsson, and Mario Lucic. Deep generative models for distribution-preserving lossy compression. In NeurIPS, pages 5929–5940, 2018.
  • [34] Eric Tzeng, Judy Hoffman, Trevor Darrell, and Kate Saenko. Simultaneous deep transfer across domains and tasks. In ICCV, pages 4068–4076, 2015.
  • [35] Yaxing Wang, Chenshen Wu, Luis Herranz, Joost van de Weijer, Abel Gonzalez-Garcia, and Bogdan Raducanu. Transferring GANs: generating images from limited data. In ECCV, pages 218–234, 2018.
  • [36] Fisher Yu, Ari Seff, Yinda Zhang, Shuran Song, Thomas Funkhouser, and Jianxiong Xiao. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365, 2015.
  • [37] Han Zhang, Ian Goodfellow, Dimitris Metaxas, and Augustus Odena. Self-attention generative adversarial networks. In ICML, 2019.
  • [38] Richard Zhang, Phillip Isola, and Alexei A. Efros. Colorful image colorization. In ECCV, pages 649–666, 2016.
  • [39] Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Object detectors emerge in deep scene CNNs. In ICLR, 2015.
  • [40] Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, and Aude Oliva. Learning deep features for scene recognition using Places database. In NeurIPS, pages 487–495, 2014.
  • [41] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV, pages 2223–2232, 2017.