Interpretable Transformations with Encoder-Decoder Networks

2017 IEEE International Conference on Computer Vision (ICCV)

Cited by 104
Abstract
Deep feature spaces have the capacity to encode complex transformations of their input data. However, understanding the relative feature-space relationship between two transformed encoded images is difficult. For instance, what is the relative feature-space relationship between two rotated images? What is decoded when we interpolate in feature space? Ideally, we want to disentangle confounding factors, such as pose, appearance, and illumination, from object identity. Disentangling these is difficult because they interact in very nonlinear ways. We propose a simple method to construct a deep feature space, with explicitly disentangled representations of several known transformations. A person or algorithm can then manipulate the disentangled representation, for example, to re-render an image with explicit control over parameterized degrees of freedom. The feature space is constructed using a transforming encoder-decoder network with a custom feature transform layer, acting on the hidden representations. We demonstrate the advantages of explicit disentangling on a variety of datasets and transformations, and as an aid for traditional tasks, such as classification.
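To make the architecture concrete, below is a minimal PyTorch sketch of a transforming encoder-decoder with a feature transform layer. All names, layer sizes, and the MNIST-sized input are illustrative assumptions, not the paper's exact design; it shows only the core idea, splitting the latent code into a "pose" block, which is rotated explicitly by 2x2 rotation matrices, and an "identity" block that passes through unchanged, before decoding.

```python
import torch
import torch.nn as nn

class TransformingAutoencoder(nn.Module):
    """Sketch of a transforming encoder-decoder (assumed, not the
    paper's exact architecture). A feature transform layer applies an
    explicit 2x2 rotation to pairs of latent "pose" channels, leaving
    the "identity" channels untouched, so the transformation parameter
    (here an angle theta) is directly manipulable in feature space.
    """
    def __init__(self, latent_dim=128, pose_dim=32):
        super().__init__()
        assert pose_dim % 2 == 0, "pose channels are rotated in pairs"
        self.pose_dim = pose_dim
        # Illustrative fully connected encoder/decoder for 28x28 inputs.
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 512), nn.ReLU(),
            nn.Linear(512, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, 28 * 28), nn.Sigmoid(),
        )

    def feature_transform(self, z, theta):
        """Rotate each consecutive channel pair of the pose block by theta.

        z:     (batch, latent_dim) hidden representation
        theta: (batch,) rotation angle in radians
        """
        pose, identity = z[:, :self.pose_dim], z[:, self.pose_dim:]
        pairs = pose.view(-1, self.pose_dim // 2, 2)
        cos, sin = torch.cos(theta), torch.sin(theta)
        # Per-example 2x2 rotation matrix, shape (batch, 2, 2).
        rot = torch.stack([
            torch.stack([cos, -sin], dim=-1),
            torch.stack([sin, cos], dim=-1),
        ], dim=-2)
        rotated = torch.einsum('bij,bpj->bpi', rot, pairs)
        return torch.cat([rotated.flatten(1), identity], dim=1)

    def forward(self, x, theta):
        z = self.encoder(x)
        z = self.feature_transform(z, theta)
        return self.decoder(z).view(-1, 1, 28, 28)
```

Under this reading, training would pair an input image with a target transformed by a known angle, and penalize the reconstruction error of decoding the rotated code; at test time, varying theta re-renders the image under explicit control of that degree of freedom.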
Keywords
interpretable transformations, encoder-decoder networks, deep feature space, relative feature-space relationship, transformed encoded images, relative feature space relationship, explicitly disentangled representations, disentangled representation, transforming encoder-decoder network, explicit disentangling, image rotation, complex transformation encoding, custom feature transform layer, parameterized degrees of freedom