CNN2: Viewpoint Generalization via a Binocular Vision

Advances in Neural Information Processing Systems 32 (NIPS 2019)

Abstract

Convolutional Neural Networks (CNNs) have laid the foundation for many techniques in various applications. Despite achieving remarkable performance in some tasks, the 3D viewpoint generalizability of CNNs is still far behind human visual capabilities. Although recent efforts, such as Capsule Networks, have been made to address this issue, these new models are hard to train and/or incompatible with existing CNN-based techniques specialized for different applications. Observing that humans use binocular vision to understand the world, we study in this paper whether the 3D viewpoint generalizability of CNNs can be achieved via binocular vision. We propose CNN2, a CNN that takes two images as input, resembling the process of an object being viewed from the left eye and the right eye. CNN2 uses novel augmentation, pooling, and convolutional layers to learn a sense of three-dimensionality in a recursive manner. Empirical evaluation shows that CNN2 has improved viewpoint generalizability compared to vanilla CNNs. Furthermore, CNN2 is easy to implement and train, and is compatible with existing CNN-based techniques specialized for different applications.
