Semi-supervised learning with deep generative models

Keywords:
contractive auto-encoders, modern data, recent advances, generative models, Bayesian inference, …

Abstract:

The ever-increasing size of modern data sets combined with the difficulty of obtaining label information has made semi-supervised learning one of the problems of significant practical importance in modern data analysis. We revisit the approach to semi-supervised learning with generative models and develop new models that allow for effective generalisation from small labelled data sets to large unlabelled ones. Generative approaches have thus far been either inflexible, inefficient or non-scalable. We show that deep generative models and approximate Bayesian inference exploiting recent advances in variational methods can be used to provide significant improvements, making generative approaches highly competitive for semi-supervised learning.

Introduction
  • Semi-supervised learning considers the problem of classification when only a small subset of the observations have corresponding class labels.
  • The simplest algorithm for semi-supervised learning is based on a self-training scheme (Rosenberg et al., 2005), in which the model is bootstrapped with additional labelled data obtained from its own highly confident predictions; this process is repeated until some termination condition is reached (a minimal sketch follows this list)
  • These methods are heuristic and prone to error, since they can reinforce their own poor predictions.
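As context for the self-training baseline described above, here is a minimal, hedged sketch of such a scheme in Python. The scikit-learn-style `fit`/`predict_proba` interface and the 0.95 confidence threshold are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def self_train(model, X_lab, y_lab, X_unlab, threshold=0.95, max_rounds=10):
    """Naive self-training: repeatedly add the model's own most confident
    predictions on unlabelled data to the labelled set, then retrain.
    As noted above, this can reinforce the model's own mistakes."""
    pool = X_unlab
    for _ in range(max_rounds):
        model.fit(X_lab, y_lab)                # retrain on the current labelled set
        if len(pool) == 0:
            break                              # termination: pool exhausted
        probs = model.predict_proba(pool)      # per-example class probabilities
        confident = probs.max(axis=1) >= threshold
        if not confident.any():
            break                              # termination: nothing confident left
        pseudo_labels = probs.argmax(axis=1)
        X_lab = np.concatenate([X_lab, pool[confident]])
        y_lab = np.concatenate([y_lab, pseudo_labels[confident]])
        pool = pool[~confident]                # drop the newly pseudo-labelled points
    return model
```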
Highlights
  • Semi-supervised learning considers the problem of classification when only a small subset of the observations have corresponding class labels
  • Such problems are of immense practical interest in a wide range of applications, including image search (Fergus et al., 2009), genomics (Shi and Zhang, 2011), natural language parsing (Liang, 2005), and speech analysis (Liu and Kirchhoff, 2013), where unlabelled data is abundant but class labels are expensive or impossible to obtain for the entire data set
  • By far the best results were obtained using the stack of models M1 and M2. This combined model provides accurate test-set predictions across all conditions and outperforms the previous best methods. We also tested this deep generative model in the fully supervised setting with all available labels, obtaining a test-set error of 0.96%, which is among the best published results for this permutation-invariant MNIST classification task
  • Since all the components of our model are parametrised by neural networks, we can readily exploit convolutional or more general locally-connected architectures, which forms a promising avenue for future exploration
  • We have developed new models for semi-supervised learning that allow us to improve the quality of prediction by exploiting information in the data density using generative models
  • We have developed an efficient variational optimisation algorithm for approximate Bayesian inference in these models and demonstrated that they are amongst the most competitive models currently available for semi-supervised learning; the variational objective is sketched below
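To make the variational objective concrete: in the paper's conditional generative model (M2), with class label y, latent variable z, and inference networks q_φ(z | x, y) and q_φ(y | x), the bounds for labelled and unlabelled data, and the combined objective with an explicit classification term, take roughly the following form (reconstructed from the paper; the weight α is a hyperparameter):

```latex
% Labelled data: variational bound on log p(x, y)
\log p_\theta(x, y) \ge \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[
  \log p_\theta(x \mid y, z) + \log p(y) + \log p(z)
  - \log q_\phi(z \mid x, y) \right] = -\mathcal{L}(x, y)

% Unlabelled data: treat y as latent and marginalise it out with the classifier
\log p_\theta(x) \ge \sum_{y} q_\phi(y \mid x)\left( -\mathcal{L}(x, y) \right)
  + \mathcal{H}\!\left( q_\phi(y \mid x) \right) = -\mathcal{U}(x)

% Combined objective, with an explicit classification loss of weight \alpha
\mathcal{J}^{\alpha} = \sum_{(x, y) \sim \widetilde{p}_l} \mathcal{L}(x, y)
  + \sum_{x \sim \widetilde{p}_u} \mathcal{U}(x)
  + \alpha \, \mathbb{E}_{\widetilde{p}_l(x, y)}\!\left[ -\log q_\phi(y \mid x) \right]
```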
Results
  • Code with which the most important results and figures can be reproduced is available at http://github.com/dpkingma/nips14-ssl.
  • This combined model provides accurate test-set predictions across all conditions and outperforms the previous best methods
  • The authors tested this deep generative model for supervised learning with all available labels, obtaining a test-set error of 0.96%, which is among the best published results for this permutation-invariant MNIST classification task.
  • The authors see that nearby regions of latent space correspond to similar writing styles, independent of the class: the left region represents upright writing styles, while the right side represents slanted styles (a sketch of this analogy-style generation follows this list)
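This style/class separation is what enables analogy-style generation: infer the style code z from one image, then decode it under every class label. A minimal sketch, assuming hypothetical `encoder` and `decoder` modules standing in for the trained M2 inference and generative networks (PyTorch here, whereas the paper's released code is Theano-based):

```python
import torch

@torch.no_grad()
def generate_analogies(encoder, decoder, x, num_classes=10):
    """Fix the inferred writing style z of the images in x and decode
    under each class label y: same style, different digit.
    (For brevity the encoder is shown as q(z | x); in M2 it is q(z | x, y).)"""
    z_mean, z_logvar = encoder(x)         # approximate posterior parameters
    z = z_mean                            # use the posterior mean as the style code
    outputs = []
    for c in range(num_classes):
        y = torch.zeros(x.size(0), num_classes)
        y[:, c] = 1.0                     # one-hot class label
        outputs.append(decoder(z, y))     # decode under p(x | y, z)
    return torch.stack(outputs, dim=1)    # shape: [batch, num_classes, ...]
```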
Conclusion
  • The approximate inference methods introduced here can be extended to the model’s parameters, harnessing the full power of variational learning.
  • The authors have developed an efficient variational optimisation algorithm for approximate Bayesian inference in these models and demonstrated that they are amongst the most competitive models currently available for semi-supervised learning.
  • The authors hope that these results stimulate the development of even more powerful semi-supervised classification methods based on generative models, for which there remains much scope; a sketch of one optimisation step follows this list
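As a companion to the objective above, here is a hedged sketch of one optimisation step on J^α, written in PyTorch rather than the paper's original Theano. The `labelled_loss`, `unlabelled_loss`, and `classify` methods and the default `alpha` are assumed components of an M2-style implementation, not the authors' API:

```python
import torch
import torch.nn.functional as F

def training_step(model, optimizer, x_lab, y_lab, x_unlab, alpha=0.1):
    """One gradient step on J^alpha: labelled bound + unlabelled bound
    + alpha times an explicit classification loss on the labelled batch."""
    optimizer.zero_grad()
    L = model.labelled_loss(x_lab, y_lab).mean()        # Monte Carlo estimate of L(x, y)
    U = model.unlabelled_loss(x_unlab).mean()           # U(x): bound summed over classes y
    ce = F.cross_entropy(model.classify(x_lab), y_lab)  # -log q(y | x) on labelled data
    loss = L + U + alpha * ce                           # combined objective J^alpha
    loss.backward()   # low-variance gradients via the reparameterisation trick
    optimizer.step()
    return loss.item()
```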
Tables
  • Table1: Benchmark results of semi-supervised classification on MNIST with few labels
  • Table2: Semi-supervised classification on the SVHN dataset with 1000 labels
  • Table3: Semi-supervised classification on the NORB dataset with 1000 labels
Reference
  • Adams, R. P. and Ghahramani, Z. (2009). Archipelago: nonparametric Bayesian semi-supervised learning. In Proceedings of the International Conference on Machine Learning (ICML).
  • Blum, A., Lafferty, J., Rwebangira, M. R., and Reddy, R. (2004). Semi-supervised learning using randomized mincuts. In Proceedings of the International Conference on Machine Learning (ICML).
  • Dayan, P. (2000). Helmholtz machines and wake-sleep learning. In Handbook of Brain Theory and Neural Networks. MIT Press, Cambridge, MA.
  • Dietterich, T. G. and Bakiri, G. (1995). Solving multiclass learning problems via error-correcting output codes. arXiv preprint cs/9501101.
  • Duchi, J., Hazan, E., and Singer, Y. (2010). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159.
  • Fergus, R., Weiss, Y., and Torralba, A. (2009). Semi-supervised learning in gigantic image collections. In Advances in Neural Information Processing Systems (NIPS).
  • Joachims, T. (1999). Transductive inference for text classification using support vector machines. In Proceedings of the International Conference on Machine Learning (ICML), volume 99, pages 200–209.
  • Kemp, C., Griffiths, T. L., Stromsten, S., and Tenenbaum, J. B. (2003). Semi-supervised learning with trees. In Advances in Neural Information Processing Systems (NIPS).
  • Kingma, D. P. and Welling, M. (2014). Auto-encoding variational Bayes. In Proceedings of the International Conference on Learning Representations (ICLR).
  • Li, P., Ying, Y., and Campbell, C. (2009). A variational approach to semi-supervised clustering. In Proceedings of the European Symposium on Artificial Neural Networks (ESANN), pages 11–16.
  • Liang, P. (2005). Semi-supervised learning for natural language. PhD thesis, Massachusetts Institute of Technology.
  • Liu, Y. and Kirchhoff, K. (2013). Graph-based semi-supervised learning for phone and segment classification. In Proceedings of Interspeech.
  • Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., and Ng, A. Y. (2011). Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning.
  • Pal, C., Sutton, C., and McCallum, A. (2005). Fast inference and learning with sparse belief propagation. In Advances in Neural Information Processing Systems (NIPS).
  • Pitelis, N., Russell, C., and Agapito, L. (2014). Semi-supervised learning using an unsupervised atlas. In Proceedings of the European Conference on Machine Learning (ECML), LNCS 8725, pages 565–580.
  • Ranzato, M. and Szummer, M. (2008). Semi-supervised learning of compact document representations with deep networks. In Proceedings of the 25th International Conference on Machine Learning (ICML), pages 792–799.
  • Rezende, D. J., Mohamed, S., and Wierstra, D. (2014). Stochastic backpropagation and approximate inference in deep generative models. In Proceedings of the International Conference on Machine Learning (ICML), volume 32 of JMLR W&CP.
  • Rifai, S., Dauphin, Y., Vincent, P., Bengio, Y., and Muller, X. (2011). The manifold tangent classifier. In Advances in Neural Information Processing Systems (NIPS), pages 2294–2302.
  • Rosenberg, C., Hebert, M., and Schneiderman, H. (2005). Semi-supervised self-training of object detection models. In Proceedings of the Seventh IEEE Workshops on Application of Computer Vision (WACV/MOTION’05).
  • Shi, M. and Zhang, B. (2011). Semi-supervised learning improves gene expression-based prediction of cancer recurrence. Bioinformatics, 27(21):3017–3023.
  • Stuhlmüller, A., Taylor, J., and Goodman, N. (2013). Learning stochastic inverses. In Advances in Neural Information Processing Systems (NIPS), pages 3048–3056.
  • Tang, Y. and Salakhutdinov, R. (2013). Learning stochastic feedforward neural networks. In Advances in Neural Information Processing Systems (NIPS), pages 530–538.
  • Wang, Y., Haffari, G., Wang, S., and Mori, G. (2009). A rate distortion approach for semi-supervised conditional random fields. In Advances in Neural Information Processing Systems (NIPS), pages 2008–2016.
  • Weston, J., Ratle, F., Mobahi, H., and Collobert, R. (2012). Deep learning via semi-supervised embedding. In Neural Networks: Tricks of the Trade, pages 639–655. Springer.
  • Zhu, X. (2006). Semi-supervised learning literature survey. Technical report, Computer Science, University of Wisconsin-Madison.
  • Zhu, X., Ghahramani, Z., Lafferty, J., et al. (2003). Semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of the International Conference on Machine Learning (ICML), volume 3, pages 912–919.