Learning to Extract Parameterized Features by Predicting Transformed Images

2011

Citations: 2 | Views: 64
Abstract
Interesting data such as images, videos, and speech are often highly structured: real photos are far from a random set of pixels. Because of this structure, the pixel representation is inefficient, and performance on tasks like classification, tracking, and object recognition can be improved by first extracting features. A variety of feature-extraction methods have been developed, ranging from adaptive methods such as PCA, k-means, Gaussian mixture models (GMMs), restricted Boltzmann machines (RBMs), and autoencoders, to hand-crafted features such as wavelets, oriented Gabor filters, and SIFT. Individual components within these adaptive representations are either independent of each other or hard to interpret. In PCA, k-means, and GMMs, the representation of a horizontal edge at one location has nothing to do with the representation of another horizontal edge at some other location. On the other hand, codes extracted by RBMs and deep autoencoders are hard to interpret and change unpredictably under simple transformations. To overcome this limitation, parameterized features with explicit pose parameters are desired. Although pose is already present in hand-crafted features like SIFT (the pose is in its descriptor), such representations have never been learned. In this thesis, we train "capsules" to perform sophisticated computation and encapsulate the result in a small vector of instantiation parameters representing the pose. The method used to obtain these capsules and extract these instantiation parameters, called the transforming autoencoder, is introduced. The transforming autoencoder is trained on pairs of images related by a transformation while having direct …
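To make the training setup described in the abstract concrete, below is a minimal sketch in PyTorch, assuming 2-D translation as the transformation and a per-capsule presence probability that gates each capsule's output. The layer sizes and all names (Capsule, TransformingAutoencoder, delta) are illustrative assumptions, not the thesis's exact architecture.

```python
# Minimal transforming-autoencoder sketch: each capsule recognizes a pose
# (here an x, y position) and a presence probability, the known translation
# delta is added directly to the pose, and a generation network predicts the
# capsule's contribution to the transformed image.
import torch
import torch.nn as nn

class Capsule(nn.Module):
    def __init__(self, in_dim, hidden=32, pose_dim=2):
        super().__init__()
        # Recognition: image -> hidden units -> pose and presence probability.
        self.recognize = nn.Sequential(nn.Linear(in_dim, hidden), nn.Sigmoid())
        self.to_pose = nn.Linear(hidden, pose_dim)
        self.to_prob = nn.Linear(hidden, 1)
        # Generation: transformed pose -> reconstructed image contribution.
        self.generate = nn.Sequential(
            nn.Linear(pose_dim, hidden), nn.Sigmoid(), nn.Linear(hidden, in_dim)
        )

    def forward(self, x, delta):
        h = self.recognize(x)
        pose = self.to_pose(h)                  # instantiation parameters
        prob = torch.sigmoid(self.to_prob(h))   # probability the entity is present
        # The known transformation is applied directly to the pose.
        return prob * self.generate(pose + delta)

class TransformingAutoencoder(nn.Module):
    def __init__(self, in_dim=28 * 28, n_capsules=10):
        super().__init__()
        self.capsules = nn.ModuleList(Capsule(in_dim) for _ in range(n_capsules))

    def forward(self, x, delta):
        # The predicted transformed image is the sum of capsule contributions.
        return torch.stack([c(x, delta) for c in self.capsules]).sum(0)

# Training pairs: an image, the known shift, and the shifted image as target.
model = TransformingAutoencoder()
x = torch.rand(8, 28 * 28)           # batch of input images (placeholder data)
delta = torch.randn(8, 2)            # known (dx, dy) for each image pair
x_shifted = torch.rand(8, 28 * 28)   # placeholder for the shifted images
loss = nn.functional.mse_loss(model(x, delta), x_shifted)
loss.backward()
```

The key design point this sketch illustrates is that the network never has to infer the transformation itself: the pose is forced to be meaningful because the externally supplied delta acts on it additively before generation.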