Étude de techniques d'apprentissage non-supervisé pour l'amélioration de l'entraînement supervisé de modèles connexionnistes (2009)
(A study of unsupervised learning techniques for improving the supervised training of connectionist models)

Abstract
The objective of the field of artificial intelligence is the development of computer systems capable of simulating behavior reminiscent of human intelligence. In particular, we would like to build a machine able to solve tasks related to vision (e.g., object recognition), natural language (e.g., topic classification), or signal processing (e.g., speech recognition). The general approach developed in the sub-field of machine learning for such tasks is to use labeled data to train a model to emulate the desired behavior. One such model is the artificial neural network, which can adapt its behavior using a backpropagated gradient [103, 135] that is informative of the errors made by the network. Popular during the 1980s, this approach has since lost some of its appeal following the development of kernel methods. Indeed, kernel methods are often found to be more stable and easier to use, and their performance usually compares favorably on a vast range of problems.

Since the foundation of the field, machine learning methods have progressed not only in their inner workings but also in the complexity of the problems they can tackle. More recently, however, it has been argued [12, 15] that kernel methods might not be able to solve, efficiently enough, problems of the complexity expected of artificial intelligence. At the same time, Hinton et al. [84] achieved a breakthrough in neural network training by developing a procedure able to train more complex neural networks (i.e., with more layers of hidden neurons) than previously possible. This is the context in which the work presented in this thesis started.

This thesis begins with an introduction to the basic principles of machine learning (Chapter 1) and the known obstacles to achieving good generalization performance (Chapter 2). The work from five papers is then presented, each paper's contribution relying on a form of unsupervised learning.

The first paper (Chapter 4) presents a training method for a specific form of single-hidden-layer neural network, the Restricted Boltzmann Machine, based on the combination of supervised and unsupervised learning. This method achieves better generalization performance than a standard neural network and a kernel support vector machine, an observation that emphasizes the beneficial effect of unsupervised learning for training neural networks.

The second paper (Chapter 6) studies and extends the training procedure of Hinton et al. [84]. More specifically, we propose a different but more flexible approach for initializing a deep (i.e., with many hidden layers) neural network, based on autoassociator networks; a minimal sketch of this layer-wise procedure is given below. We also empirically analyze the impact of varying the number of layers and the number of hidden neurons on the performance of a neural network, and we describe variants of the same training procedure that are better suited to continuous-valued inputs and online learning.

The third paper (Chapter 8) describes a more extensive empirical evaluation of training algorithms for deep networks on several classification problems. These problems were generated from several factors of variation, in order to simulate a property expected of artificial intelligence problems. The experiments presented in this paper tend to show that deep networks are more appropriate than shallow models such as kernel methods.
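To make the layer-wise initialization idea from the second paper concrete, here is a minimal sketch of greedy pretraining with autoassociators: each layer is trained in turn to reconstruct its input, and its hidden representation then becomes the input of the next layer. The sigmoid units, tied weights, squared-error reconstruction loss, layer sizes, and learning rate are illustrative assumptions, not the exact choices made in the thesis.

```python
# Minimal sketch of greedy layer-wise pretraining with autoassociators
# (autoencoders). Hyperparameters and loss are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoencoder(X, n_hidden, lr=0.1, epochs=50):
    """Train a one-hidden-layer autoassociator on X; return (W, b_hidden)."""
    n_in = X.shape[1]
    W = rng.normal(0.0, 0.01, size=(n_in, n_hidden))  # tied weights: W and W.T
    b_h = np.zeros(n_hidden)
    b_v = np.zeros(n_in)
    for _ in range(epochs):
        for x in X:
            h = sigmoid(x @ W + b_h)          # encode
            x_hat = sigmoid(h @ W.T + b_v)    # decode (tied weights)
            # Backpropagate the squared reconstruction error.
            d_out = (x_hat - x) * x_hat * (1.0 - x_hat)
            d_hid = (d_out @ W) * h * (1.0 - h)
            W -= lr * (np.outer(x, d_hid) + np.outer(d_out, h))
            b_v -= lr * d_out
            b_h -= lr * d_hid
    return W, b_h

def pretrain_stack(X, layer_sizes):
    """Greedily train one autoassociator per layer; each layer's hidden
    representation becomes the training input of the next layer."""
    params, H = [], X
    for n_hidden in layer_sizes:
        W, b_h = train_autoencoder(H, n_hidden)
        params.append((W, b_h))
        H = sigmoid(H @ W + b_h)  # propagate the data through the new layer
    return params  # initializes a deep network before supervised fine-tuning

# Example: pretrain a 3-layer stack on random binary "data".
X = (rng.random((100, 20)) > 0.5).astype(float)
stack = pretrain_stack(X, layer_sizes=[15, 10, 5])
```

After pretraining, the stored weights and biases would initialize the hidden layers of a deep network, which is then fine-tuned with ordinary supervised backpropagation.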
The fourth paper (Chapter 10) develops an improved variant of the autoassociator network. This simple variant, which brings better generalization performance to deep networks, modifies the autoassociator's training procedure by corrupting its input and forcing the network to denoise it; a minimal sketch of this modification follows below.

The fifth and final paper (Chapter 12) contributes another improved variant of the autoassociator network, which allows inhibitory/facilitatory interactions between the hidden-layer neurons. We show that such interactions can be learned and can be beneficial to the performance of deep networks.
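The denoising modification from the fourth paper can be sketched in a few lines: the input is corrupted (here with masking noise) before encoding, but the reconstruction error is measured against the clean input. The 30% corruption level, sigmoid units, tied weights, and squared-error loss are illustrative assumptions.

```python
# Minimal sketch of the denoising modification: corrupt the input, then
# train the autoassociator to reconstruct the CLEAN input.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def denoising_step(x, W, b_h, b_v, lr=0.1, corruption=0.3):
    """One stochastic gradient step of a denoising autoassociator."""
    mask = rng.random(x.shape) > corruption
    x_tilde = x * mask                       # corrupted copy of the input
    h = sigmoid(x_tilde @ W + b_h)           # encode the corrupted input
    x_hat = sigmoid(h @ W.T + b_v)           # decode (tied weights)
    d_out = (x_hat - x) * x_hat * (1.0 - x_hat)  # error vs. the clean input
    d_hid = (d_out @ W) * h * (1.0 - h)
    W -= lr * (np.outer(x_tilde, d_hid) + np.outer(d_out, h))
    b_v -= lr * d_out
    b_h -= lr * d_hid
    return W, b_h, b_v

# Example usage on one random binary input vector.
n_in, n_hidden = 20, 10
W = rng.normal(0.0, 0.01, size=(n_in, n_hidden))
b_h, b_v = np.zeros(n_hidden), np.zeros(n_in)
x = (rng.random(n_in) > 0.5).astype(float)
W, b_h, b_v = denoising_step(x, W, b_h, b_v)
```

Because the network can no longer simply copy its input, it is pushed to capture statistical structure in the data that survives the corruption.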
Keywords
unsupervised learning, neural network, Restricted Boltzmann Machine, autoassociator, autoencoder, deep architecture, deep learning