Semi-Supervised Learning with Generative Adversarial Networks

arXiv: Machine Learning, abs/1606.01583, 2016.

Keywords: REAL, FAKE, generative model, ZERO, semi-supervised (and others)

Abstract:

We extend Generative Adversarial Networks (GANs) to the semi-supervised context by forcing the discriminator network to output class labels. We train a generative model G and a discriminator D on a dataset with inputs belonging to one of N classes. At training time, D is made to predict which of N+1 classes the input belongs to, where an extra class is added to correspond to the outputs of G. We show that this method can be used to create a more data-efficient classifier and that it allows for generating higher quality samples than a regular GAN.
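As an illustrative aside (a restatement of the setup above, not text from the paper), the change amounts to replacing the discriminator's single real/fake output with an (N+1)-way softmax:

    D(x) = \big( p(y = 1 \mid x), \dots, p(y = N \mid x), p(y = N+1 \mid x) \big), \qquad \sum_{k=1}^{N+1} p(y = k \mid x) = 1,

where class N+1 (FAKE) is reserved for samples produced by G, and the first N entries serve as the class predictions of the classifier C.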

Code: https://github.com/DoctorTeeth/supergan

Data: MNIST (LeCun et al, 1998)
Introduction
  • Work on generating images with Generative Adversarial Networks (GANs) has shown promising results (Goodfellow et al, 2014).
  • G is trained to maximize the probability that D makes a mistake, and D is trained to minimize that probability (the corresponding minimax objective is written out after this list).
  • Building on these ideas, one can generate good output samples using a cascade (Denton et al, 2015) of convolutional neural networks.
  • Using generative models on semi-supervised learning tasks is not a new idea - Kingma et al (2014) expand work on variational generative techniques (Kingma & Welling, 2013; Rezende et al, 2014) to do just that.
  • The CatGAN (Springenberg, 2015) modifies the objective function to take into account mutual information between observed examples and their predicted class distributions.
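    For reference, the two-player game described in the bullets above is the minimax objective of Goodfellow et al (2014):

        \min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

    where D(x) is the probability that x came from the training data rather than from G, and z is the noise vector fed to the generator.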
Highlights
  • Work on generating images with Generative Adversarial Networks (GANs) has shown promising results (Goodfellow et al, 2014)
  • We describe a novel extension to Generative Adversarial Networks that allows them to learn a generative model and a classifier simultaneously
  • We demonstrate that SGAN can significantly improve the quality of the generated samples and reduce training times for the generator
  • The experiments in this paper were conducted with https://github.com/DoctorTeeth/supergan, which borrows heavily from https://github.com/carpedm20/DCGAN-tensorflow and which contains more details about the experimental setup
  • We conducted experiments on MNIST to see whether the classifier component of the SGAN would perform better than an isolated classifier on restricted training sets
  • Note that the second configuration (training with only the labels REAL and FAKE) is semantically identical to a normal Generative Adversarial Network (GAN)
Results
  • The experiments in this paper were conducted with https://github.com/DoctorTeeth/supergan, which borrows heavily from https://github.com/carpedm20/DCGAN-tensorflow and which contains more details about the experimental setup.
  • The authors conducted experiments on MNIST to see whether the classifier component of the SGAN would perform better than an isolated classifier on restricted training sets.
  • The authors ran experiments on the MNIST dataset (LeCun et al, 1998) to determine whether an SGAN would result in better generative samples than a regular GAN.
  • The SGAN outputs are significantly clearer than the GAN outputs
  • This seemed to hold true across different initializations and network architectures.
Conclusion
  • Conclusion and Future Work

    The authors are excited to explore the following related ideas:

    Share some of the weights between D and C, as in the dual autoencoder (Sutskever et al, 2015).
  • Introduce a ladder network (Rasmus et al, 2015) L in place of D/C, and use samples from G as unlabeled data to train L
Summary
  • Work on generating images with Generative Adversarial Networks (GANs) has shown promising results (Goodfellow et al, 2014).
  • One can generate good output samples using a cascade (Denton et al, 2015) of convolutional neural networks.
  • More recently (Radford et al, 2015), even better samples were created from a single generator network.
  • We consider the situation where we try to solve a semi-supervised classification task and learn a generative model simultaneously.
  • We may learn a generative model for MNIST images while we train an image classifier, which we’ll call C.
  • The fact that representations learned by D help improve C is not surprising - it seems reasonable that this should work.
  • Using the learned representations of D after the fact doesn’t allow for training C and G simultaneously.
  • We describe a novel extension to GANs that allows them to learn a generative model and a classifier simultaneously.
  • We show that SGAN improves classification performance on restricted data sets over a baseline classifier with no generative component.
  • We demonstrate that SGAN can significantly improve the quality of the generated samples and reduce training times for the generator.
  • The discriminator network D in a normal GAN outputs an estimated probability that the input image is drawn from the data generating distribution.
  • We use higher granularity labels for the half of the minibatch that has been drawn from the data generating distribution (a training-step sketch of this labeling scheme follows this summary).
  • We conducted experiments on MNIST to see whether the classifier component of the SGAN would perform better than an isolated classifier on restricted training sets.
  • SGAN outperforms the baseline in proportion to how much we shrink the training set, suggesting that forcing D and C to share weights improves data-efficiency.
  • We ran experiments on the MNIST dataset (LeCun et al, 1998) to determine whether an SGAN would result in better generative samples than a regular GAN.
  • Using an architecture similar to that in Radford et al (2015), we trained an SGAN both using the actual MNIST labels and with only the labels REAL and FAKE.
  • Figure 1 contains examples of generative outputs from both GAN and SGAN.
  • Make the GAN generate examples with class labels (Mirza & Osindero, 2014).
  • Introduce a ladder network (Rasmus et al, 2015) L in place of D/C, and use samples from G as unlabeled data to train L.
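    The paper's experiments were run in TensorFlow (see the repository linked above); the following is a minimal PyTorch sketch of the training step the summary describes, included here as an illustration only. The network definitions (netD, netG), layer sizes, and optimizer settings are placeholders rather than the authors' architecture; the point is the (N+1)-way classifier head and the labeling of real versus generated examples.

        # Hedged sketch of one SGAN update (assumptions: PyTorch, MNIST-sized inputs).
        import torch
        import torch.nn as nn

        N_CLASSES = 10                 # the N real classes (MNIST digits)
        FAKE = N_CLASSES               # extra class index reserved for outputs of G
        Z_DIM = 100

        # Placeholder networks; the paper uses DCGAN-style convolutional nets.
        netD = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(),
                             nn.Linear(256, N_CLASSES + 1))   # N+1 logits
        netG = nn.Sequential(nn.Linear(Z_DIM, 28 * 28), nn.Tanh())

        opt_d = torch.optim.Adam(netD.parameters(), lr=2e-4)
        opt_g = torch.optim.Adam(netG.parameters(), lr=2e-4)
        nll = nn.CrossEntropyLoss()    # softmax + negative log-likelihood over N+1 classes

        def sgan_step(real_x, real_y):
            """Real half keeps its true digit label (long tensor); generated half gets FAKE."""
            m = real_x.size(0)

            # Discriminator/classifier update on the combined minibatch.
            fake_x = netG(torch.randn(m, Z_DIM)).view(m, 1, 28, 28).detach()
            fake_y = torch.full((m,), FAKE, dtype=torch.long)
            d_loss = nll(netD(real_x), real_y) + nll(netD(fake_x), fake_y)
            opt_d.zero_grad(); d_loss.backward(); opt_d.step()

            # Generator update: make D less likely to assign FAKE to G's samples.
            gen_x = netG(torch.randn(m, Z_DIM)).view(m, 1, 28, 28)
            g_loss = -nll(netD(gen_x), fake_y)
            opt_g.zero_grad(); g_loss.backward(); opt_g.step()
            return d_loss.item(), g_loss.item()

    At test time the classifier C is read off the first N logits of netD; collapsing the N real classes into a single REAL label recovers the REAL/FAKE-only configuration mentioned above, which is semantically a normal GAN.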
Tables
  • Table 1: Classifier Accuracy
Funding
  • Extends Generative Adversarial Networks to the semi-supervised context by forcing the discriminator network to output class labels
  • D is made to predict which of N+1 classes the input belongs to, where an extra class is added to correspond to the outputs of G
  • Shows that this method can be used to create a more data-efficient classifier and that it allows for generating higher quality samples than a regular GAN
  • Makes the following contributions: First, describes a novel extension to GANs that allows them to learn a generative model and a classifier simultaneously
Reference
  • Abadi, Martın, Agarwal, Ashish, Barham, Paul, Brevdo, Eugene, Chen, Zhifeng, Citro, Craig, Corrado, Gregory S., Davis, Andy, Dean, Jeffrey, Devin, Matthieu, Ghemawat, Sanjay, Goodfellow, Ian J., Harp, Andrew, Irving, Geoffrey, Isard, Michael, Jia, Yangqing, Jozefowicz, Rafal, Kaiser, Lukasz, Kudlur, Manjunath, Levenberg, Josh, Mane, Dan, Monga, Rajat, Moore, Sherry, Murray, Derek Gordon, Olah, Chris, Schuster, Mike, Shlens, Jonathon, Steiner, Benoit, Sutskever, Ilya, Talwar, Kunal, Tucker, Paul A., Vanhoucke, Vincent, Vasudevan, Vijay, Viegas, Fernanda B., Vinyals, Oriol, Warden, Pete, Wattenberg, Martin, Wicke, Martin, Yu, Yuan, and Zheng, Xiaoqiang. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. CoRR, abs/1603.04467, 2016. URL http://arxiv.org/abs/1603.04467.
  • Denton, Emily L., Chintala, Soumith, Szlam, Arthur, and Fergus, Robert. Deep generative image models using a laplacian pyramid of adversarial networks. CoRR, abs/1506.05751, 2015. URL http://arxiv.org/abs/1506.05751.
  • Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. Generative Adversarial Networks. ArXiv e-prints, June 2014.
  • Kingma, D. P. and Welling, M. Auto-Encoding Variational Bayes. ArXiv e-prints, December 2013.
  • Kingma, Diederik P., Rezende, Danilo Jimenez, Mohamed, Shakir, and Welling, Max. Semi-supervised learning with deep generative models. CoRR, abs/1406.5298, 2014. URL http://arxiv.org/abs/1406.5298.
  • LeCun, Yann, Bottou, Leon, Bengio, Yoshua, and Haffner, Patrick. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
  • Mirza, Mehdi and Osindero, Simon. Conditional generative adversarial nets. CoRR, abs/1411.1784, 2014. URL http://arxiv.org/abs/1411.1784.
  • Radford, Alec, Metz, Luke, and Chintala, Soumith. Unsupervised representation learning with deep convolutional generative adversarial networks. CoRR, abs/1511.06434, 2015. URL http://arxiv.org/abs/1511.06434.
  • Rasmus, Antti, Valpola, Harri, Honkala, Mikko, Berglund, Mathias, and Raiko, Tapani. Semi-supervised learning with ladder network. CoRR, abs/1507.02672, 2015. URL http://arxiv.org/abs/1507.02672.
  • Rezende, D., Mohamed, S., and Wierstra, D. Stochastic Backpropagation and Approximate Inference in Deep Generative Models. ArXiv e-prints, January 2014.
  • Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., and Chen, X. Improved Techniques for Training GANs. ArXiv e-prints, June 2016.
  • Springenberg, J. T. Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks. ArXiv e-prints, November 2015.
  • Sutskever, I., Jozefowicz, R., Gregor, K., Rezende, D., Lillicrap, T., and Vinyals, O. Towards Principled Unsupervised Learning. ArXiv e-prints, November 2015.