Good Semi-supervised Learning that Requires a Bad GAN




Semi-supervised learning methods based on generative adversarial networks (GANs) obtained strong empirical results, but it is not clear 1) how the discriminator benefits from joint training with a generator, and 2) why good semi-supervised classification performance and a good generator cannot be obtained at the same time. Theoretically, we show that given the discriminator objective, good semi-supervised learning indeed requires a bad generator.



Full Text
  • Deep neural networks are usually trained on a large amount of labeled data, and it has been a challenge to apply deep models to datasets with limited labels.
  • Under mild assumptions, we show that a properly optimized discriminator obtains correct decision boundaries in high-density areas in the feature space if the generator is a complement generator.
  • Our approach substantially improves over vanilla feature matching GANs, and obtains new state-of-the-art results on MNIST, SVHN, and CIFAR-10 when all methods are compared under the same discriminator architecture.
  • The first term maximizes the log conditional probability for labeled data, which is the standard cost in the supervised learning setting.
  • Proposition 2 guarantees that when G is a complement generator, under mild assumptions, a near-optimal D learns correct decision boundaries in each high-density subset Fk of the data support in the feature space.
  • The (K + 1)class formulation is effective because the generated complement samples encourage the discriminator to place the class boundaries in low-density areas (Proposition 2).
  • As discussed in previous sections, feature matching GANs suffer from the following drawbacks: 1) the first-order moment matching objective does not prevent the generator from collapsing; 2) feature matching can generate high-density samples inside the manifold; 3) the discriminator objective does not encourage realization of condition (3) in Assumption 1, as discussed in Section 3.2.
  • The second method aims at increasing the generator entropy in the feature space by optimizing an auxiliary objective.
  • The second drawback of feature matching GANs is that high-density samples can be generated in the feature space, which is not desirable according to our analysis.
  • The feature matching term in Eq (4) can be seen as softly enforcing this constraint by bringing generated samples “close” to the true data (Cf. Section 4).
  • Optimizing our proposed objective (4) can be understood as minimizing the KL divergence between the generator distribution and a desired complement distribution, which connects our practical solution to our theoretical analysis.
  • In order for the complement generator to work, according to condition (3) in Assumption 1, the discriminator needs to have strong true-fake belief on unlabeled data, i.e., max_{k=1,…,K} w_k⊤ f(x) > 0.
  • To guarantee strong true-fake belief in the optimal conditions, we add a conditional entropy term to the discriminator objective, which becomes max_D E_{x,y∼L} log p_D(y|x, y ≤ K) + E_{x∼U} log p_D(y ≤ K|x) + E_{x∼p_G} log p_D(K+1|x) + E_{x∼U} Σ_{k=1}^{K} p_D(k|x, y ≤ K) log p_D(k|x, y ≤ K).
  • We present a semi-supervised learning framework that uses generated data to boost task performance.
  • Our proposed method improves the performance of image classification on several benchmark datasets.
  • The same phenomenon was also observed in [21], where the model generated better images but failed to improve semi-supervised learning performance.
  • Our proposed methods consistently improve the performance over feature matching.
  • We achieve new state-of-the-art results on all the datasets when only small discriminator architectures are considered.
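The (K + 1)-class discriminator objective above, including the added conditional-entropy term, can be sketched numerically. This is a minimal NumPy sketch, not the paper's code: the logit layout (last column = fake class K + 1), the function names, and the mean-over-batch reduction are my assumptions.

```python
import numpy as np

def logsumexp(a, axis=-1):
    """Numerically stable log-sum-exp reduction."""
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def discriminator_loss(logits_lab, y_lab, logits_unl, logits_gen):
    """Negated (K+1)-class discriminator objective, to be minimized.

    logits_lab : (B, K+1) logits for labeled real samples
    y_lab      : (B,) integer labels in [0, K)
    logits_unl : (B, K+1) logits for unlabeled real samples
    logits_gen : (B, K+1) logits for generated samples; column K is "fake"
    """
    K = logits_lab.shape[1] - 1

    # supervised term: log p_D(y | x, y <= K), softmax over the K real classes only
    log_p_cond = logits_lab[:, :K] - logsumexp(logits_lab[:, :K])[:, None]
    loss_lab = -log_p_cond[np.arange(len(y_lab)), y_lab].mean()

    # unlabeled term: log p_D(y <= K | x) = logsumexp(real logits) - logsumexp(all logits)
    loss_unl = -(logsumexp(logits_unl[:, :K]) - logsumexp(logits_unl)).mean()

    # generated term: log p_D(K+1 | x)
    loss_gen = -(logits_gen[:, K] - logsumexp(logits_gen)).mean()

    # conditional entropy H(p_D(k | x, y <= K)) on unlabeled data; minimizing it
    # pushes the discriminator toward strong (confident) true-class beliefs
    log_q = logits_unl[:, :K] - logsumexp(logits_unl[:, :K])[:, None]
    cond_entropy = -(np.exp(log_q) * log_q).sum(axis=1).mean()

    return loss_lab + loss_unl + loss_gen + cond_entropy
```

Note that minimizing the conditional-entropy term is the loss-side counterpart of maximizing E_{x∼U} Σ_k p_D(k|x, y ≤ K) log p_D(k|x, y ≤ K) in the objective.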
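The feature-matching term referenced in Eq (4) is first-order moment matching between real and generated samples in the discriminator's feature space. A short sketch, where `feat_real` and `feat_fake` stand for the discriminator's penultimate-layer features (my naming, not the paper's):

```python
import numpy as np

def feature_matching_loss(feat_real, feat_fake):
    """Squared L2 distance between batch means of discriminator features:
    || E_x f(x) - E_z f(G(z)) ||^2 estimated on minibatches.

    feat_real : (B, d) features of unlabeled real samples
    feat_fake : (B, d) features of generated samples
    """
    diff = feat_real.mean(axis=0) - feat_fake.mean(axis=0)
    return float(diff @ diff)
```

This makes the first drawback listed above concrete: any generated batch whose feature mean matches the data mean is a perfect minimizer, so nothing in this term alone prevents the generator from collapsing.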
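For the second method (raising generator entropy in the feature space), one concrete auxiliary objective is a pull-away-style penalty on pairwise similarity of generated features. Treat this as an illustrative proxy under my assumptions, not a reproduction of the paper's exact objective:

```python
import numpy as np

def pull_away_term(feat_fake, eps=1e-8):
    """Mean squared cosine similarity over all distinct pairs of
    generated-sample features. Minimizing it spreads the generated
    features apart, a rough proxy for increasing generator entropy.

    feat_fake : (B, d) features of generated samples, B >= 2
    """
    f = feat_fake / (np.linalg.norm(feat_fake, axis=1, keepdims=True) + eps)
    sim = f @ f.T                       # (B, B) cosine similarities
    b = len(f)
    # subtract the diagonal (self-similarity = 1) and average over pairs
    return (np.square(sim).sum() - b) / (b * (b - 1))
```

The penalty is near 1 when the generator collapses (all features identical) and near 0 when generated features are mutually orthogonal, so adding it to the generator loss directly opposes the collapse failure mode noted above.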