Semi-supervised learning with deep generative models
Keywords:
Abstract:
The ever-increasing size of modern data sets combined with the difficulty of obtaining label information has made semi-supervised learning one of the problems of significant practical importance in modern data analysis. We revisit the approach to semi-supervised learning with generative models and develop new models that allow for effective generalisation from small labelled data sets to large unlabelled ones.
Code: http://github.com/dpkingma/nips14-ssl
Data: MNIST, SVHN, NORB
Introduction
- Semi-supervised learning considers the problem of classification when only a small subset of the observations have corresponding class labels.
- The simplest algorithm for semi-supervised learning is based on a self-training scheme (Rosenberg et al., 2005), in which the model is bootstrapped with additional labelled data obtained from its own highly confident predictions; this process is repeated until some termination condition is reached (a generic sketch of this loop follows this list)
- These methods are heuristic and prone to error since they can reinforce poor predictions.
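The self-training loop mentioned above can be written in a few lines. The sketch below is a generic illustration only, not the paper's method: it uses scikit-learn's LogisticRegression as a stand-in base classifier, and the confidence threshold and stopping rule are arbitrary placeholder choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_training(X_lab, y_lab, X_unlab, threshold=0.95, max_rounds=10):
    """Generic self-training loop (illustrative; not the paper's method).

    Repeatedly fits a base classifier on the labelled set, moves unlabelled
    points whose predicted class probability exceeds `threshold` into the
    labelled set with their predicted labels, and stops when no point is
    confident enough or `max_rounds` is reached.
    """
    X_lab, y_lab, X_unlab = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    for _ in range(max_rounds):
        clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
        if len(X_unlab) == 0:
            break
        proba = clf.predict_proba(X_unlab)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break  # termination condition: no confident predictions left
        # Adopt the model's own confident predictions as pseudo-labels.
        pseudo_labels = clf.classes_[proba[confident].argmax(axis=1)]
        X_lab = np.vstack([X_lab, X_unlab[confident]])
        y_lab = np.concatenate([y_lab, pseudo_labels])
        X_unlab = X_unlab[~confident]
    return clf
```

Because the pseudo-labels come from the model itself, any early mistake is fed back into training, which is exactly the failure mode noted in the next bullet.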
Highlights
- Semi-supervised learning considers the problem of classification when only a small subset of the observations have corresponding class labels
- Such problems are of immense practical interest in a wide range of applications, including image search (Fergus et al, 2009), genomics (Shi and Zhang, 2011), natural language parsing (Liang, 2005), and speech analysis (Liu and Kirchhoff, 2013), where unlabelled data is abundant but class labels are expensive or impossible to obtain for the entire data set
- By far the best results were obtained using the stack of models M1 and M2. This combined model provides accurate test-set predictions across all conditions and outperforms the previously best methods. We also tested this deep generative model for supervised learning with all available labels, obtaining a test-set error of 0.96%, which is among the best published results for this permutation-invariant MNIST classification task
- Since all the components of our model are parametrised by neural networks, we can readily exploit convolutional or more general locally-connected architectures, which form a promising avenue for future exploration
- We have developed new models for semi-supervised learning that allow us to improve the quality of prediction by exploiting information in the data density using generative models
- We have developed an efficient variational optimisation algorithm for approximate Bayesian inference in these models and demonstrated that they are amongst the most competitive models currently available for semi-supervised learning; the objective this algorithm optimises is sketched below
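For reference, the training objective of the M2 model from the original paper (not reproduced elsewhere in this summary) combines a lower bound on labelled data, a bound on unlabelled data that marginalises over the predicted label, and an explicit classification term weighted by α; θ denotes generative parameters and φ inference parameters:

```latex
% Labelled-data evidence lower bound
-\mathcal{L}(x, y) = \mathbb{E}_{q_\phi(z \mid x, y)}\!\big[\log p_\theta(x \mid y, z) + \log p_\theta(y) + \log p(z) - \log q_\phi(z \mid x, y)\big]

% Unlabelled-data bound: the unknown label is treated as a latent variable
-\mathcal{U}(x) = \sum_{y} q_\phi(y \mid x)\big(-\mathcal{L}(x, y)\big) + \mathcal{H}\big(q_\phi(y \mid x)\big)

% Overall objective, with an explicit classification loss on labelled pairs
\mathcal{J}^{\alpha} = \sum_{(x, y) \sim \widetilde{p}_l} \mathcal{L}(x, y)
  + \sum_{x \sim \widetilde{p}_u} \mathcal{U}(x)
  + \alpha \, \mathbb{E}_{(x, y) \sim \widetilde{p}_l}\big[-\log q_\phi(y \mid x)\big]
```

The classifier q_φ(y | x) thus learns from both data sources: directly from the labelled pairs via the α-weighted term, and indirectly from unlabelled data through the marginalisation in U(x).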
Results
- Code with which the most important results and figures can be reproduced is available at http://github.com/dpkingma/nips14-ssl
- This combined model provides accurate test-set predictions across all conditions, and outperforms the previously best methods
- The authors tested this deep generative model for supervised learning with all available labels, obtaining a test-set error of 0.96%, which is among the best published results for this permutation-invariant MNIST classification task.
- The authors see that nearby regions of latent space correspond to similar writing styles, independent of the class: the left region represents upright writing styles, while the right region represents slanted styles (a conditional-generation sketch of this style/class separation follows this list)
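The style/class separation described above can be probed by conditional generation: infer the latent "style" z from one image, then decode it once for every class label. The sketch below assumes hypothetical `encode` and `decode` callables standing in for q(z | x, y) and p(x | y, z) of a trained M2 model; they are illustrative placeholders, not functions from the paper's code release.

```python
import numpy as np

def style_analogies(x, y, encode, decode, num_classes=10):
    """Generate analogies of one image across all classes (illustrative).

    `encode(x, y)` is assumed to return the mean of q(z | x, y) and
    `decode(y_onehot, z)` the mean of p(x | y, z); both are placeholders
    for a trained M2 model, not part of the paper's released API.
    """
    z = encode(x, y)  # latent "style" of the input image
    analogies = []
    for k in range(num_classes):
        y_onehot = np.zeros(num_classes)
        y_onehot[k] = 1.0
        analogies.append(decode(y_onehot, z))  # same style, different class
    return np.stack(analogies)
```

If style and class are well separated, the generated digits share slant and stroke width while the identity changes with the label.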
Conclusion
- The approximate inference methods introduced here can be extended to the model's parameters, harnessing the full power of variational learning.
- The authors have developed an efficient variational optimisation algorithm for approximate Bayesian inference in these models and demonstrated that they are amongst the most competitive models currently available for semi-supervised learning.
- The authors hope that these results stimulate the development of even more powerful semi-supervised classification methods based on generative models, for which there remains much scope
Tables
- Table1: Benchmark results of semi-supervised classification on MNIST with few labels
- Table2: Semi-supervised classification on the SVHN dataset with 1000 labels
- Table3: Semi-supervised classification on the NORB dataset with 1000 labels
Reference
- Adams, R. P. and Ghahramani, Z. (2009). Archipelago: nonparametric Bayesian semi-supervised learning. In Proceedings of the International Conference on Machine Learning (ICML).
- Blum, A., Lafferty, J., Rwebangira, M. R., and Reddy, R. (2004). Semi-supervised learning using randomized mincuts. In Proceedings of the International Conference on Machine Learning (ICML).
- Dayan, P. (2000). Helmholtz machines and wake-sleep learning. Handbook of Brain Theory and Neural Networks. MIT Press, Cambridge, MA, 44(0).
- Dietterich, T. G. and Bakiri, G. (1995). Solving multiclass learning problems via error-correcting output codes. arXiv preprint cs/9501101.
- Duchi, J., Hazan, E., and Singer, Y. (2010). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159.
- Fergus, R., Weiss, Y., and Torralba, A. (2009). Semi-supervised learning in gigantic image collections. In Advances in Neural Information Processing Systems (NIPS).
- Joachims, T. (1999). Transductive inference for text classification using support vector machines. In Proceedings of the International Conference on Machine Learning (ICML), volume 99, pages 200–209.
- Kemp, C., Griffiths, T. L., Stromsten, S., and Tenenbaum, J. B. (2003). Semi-supervised learning with trees. In Advances in Neural Information Processing Systems (NIPS).
- Kingma, D. P. and Welling, M. (2014). Auto-encoding variational Bayes. In Proceedings of the International Conference on Learning Representations (ICLR).
- Li, P., Ying, Y., and Campbell, C. (2009). A variational approach to semi-supervised clustering. In Proceedings of the European Symposium on Artificial Neural Networks (ESANN), pages 11 – 16.
- Liang, P. (2005). Semi-supervised learning for natural language. PhD thesis, Massachusetts Institute of Technology.
- Liu, Y. and Kirchhoff, K. (2013). Graph-based semi-supervised learning for phone and segment classification. In Proceedings of Interspeech.
- Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., and Ng, A. Y. (2011). Reading digits in natural images with unsupervised feature learning. In NIPS workshop on deep learning and unsupervised feature learning.
- Pal, C., Sutton, C., and McCallum, A. (2005). Fast inference and learning with sparse belief propagation. In Advances in Neural Information Processing Systems (NIPS).
- Pitelis, N., Russell, C., and Agapito, L. (2014). Semi-supervised learning using an unsupervised atlas. In Proceedings of the European Conference on Machine Learning (ECML), volume LNCS 8725, pages 565–580.
- Ranzato, M. and Szummer, M. (2008). Semi-supervised learning of compact document representations with deep networks. In Proceedings of the 25th International Conference on Machine Learning (ICML), pages 792–799.
- Rezende, D. J., Mohamed, S., and Wierstra, D. (2014). Stochastic backpropagation and approximate inference in deep generative models. In Proceedings of the International Conference on Machine Learning (ICML), volume 32 of JMLR W&CP.
- Rifai, S., Dauphin, Y., Vincent, P., Bengio, Y., and Muller, X. (2011). The manifold tangent classifier. In Advances in Neural Information Processing Systems (NIPS), pages 2294–2302.
- Rosenberg, C., Hebert, M., and Schneiderman, H. (2005). Semi-supervised self-training of object detection models. In Proceedings of the Seventh IEEE Workshops on Application of Computer Vision (WACV/MOTION’05).
- Shi, M. and Zhang, B. (2011). Semi-supervised learning improves gene expression-based prediction of cancer recurrence. Bioinformatics, 27(21):3017–3023.
- Stuhlmuller, A., Taylor, J., and Goodman, N. (2013). Learning stochastic inverses. In Advances in Neural Information Processing Systems (NIPS), pages 3048–3056.
- Tang, Y. and Salakhutdinov, R. (2013). Learning stochastic feedforward neural networks. In Advances in Neural Information Processing Systems (NIPS), pages 530–538.
- Wang, Y., Haffari, G., Wang, S., and Mori, G. (2009). A rate distortion approach for semi-supervised conditional random fields. In Advances in Neural Information Processing Systems (NIPS), pages 2008–2016.
- Weston, J., Ratle, F., Mobahi, H., and Collobert, R. (2012). Deep learning via semi-supervised embedding. In Neural Networks: Tricks of the Trade, pages 639–655. Springer.
- Zhu, X. (2006). Semi-supervised learning literature survey. Technical report, Computer Science, University of Wisconsin-Madison.
- Zhu, X., Ghahramani, Z., Lafferty, J., et al. (2003). Semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of the International Conference on Machine Learning (ICML), volume 3, pages 912–919.