Class-agnostic Instance Segmentation with Foveated Image Sampling

Marissa Anthea Weis, Alexander S. Ecker


Instance segmentation is the computer vision task of detecting and segmenting individual object instances in images. We propose the foveal segmenter, a convolutional neural network (CNN) that processes images in log-polar space, to tackle class-agnostic instance segmentation. The log-polar mapping is inspired by human vision, which has a high-resolution center, the fovea, and decreasing resolution towards the periphery. Processing images in log-polar rather than Cartesian coordinates is an intelligent downsampling strategy with a major advantage: rotation and scaling about the polar origin reduce to translation, to which CNNs are inherently equivariant. The foveal segmenter is therefore equivariant to rotation and scale, which enables better use of limited training data. Additionally, it retains a high resolution for the objects it is attending to, thereby improving the segmentation of small object instances. We report instance segmentation results on the common datasets COCO and Cityscapes and demonstrate that the foveal segmenter, despite being a much simpler network, outperforms Mask R-CNN on small and medium-sized objects in Cityscapes when both networks are given the object locations.
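The claim that rotation and scaling about the polar origin become translations in log-polar space can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `to_log_polar` and `log_polar_sample`, the grid size, the radius range, and the nearest-neighbour lookup are all assumptions made for illustration, since the abstract does not specify the sampling scheme.

```python
import numpy as np


def to_log_polar(points, center):
    """Map Cartesian points of shape (N, 2) to (log r, theta) about `center`."""
    d = np.asarray(points, dtype=float) - np.asarray(center, dtype=float)
    r = np.hypot(d[:, 0], d[:, 1])
    theta = np.arctan2(d[:, 1], d[:, 0])
    return np.stack([np.log(r), theta], axis=1)


def log_polar_sample(image, center, n_rho=32, n_theta=64, r_min=1.0, r_max=None):
    """Resample a 2D image on a log-polar grid (hypothetical parameters).

    Rows index log-radius, columns index angle. Radii are spaced evenly in
    log r, so samples are dense near `center` and sparse in the periphery.
    """
    h, w = image.shape
    cx, cy = center
    if r_max is None:
        # Largest radius that stays inside the image (assumed default).
        r_max = min(cx, cy, w - 1 - cx, h - 1 - cy)
    rho = np.linspace(np.log(r_min), np.log(r_max), n_rho)
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    r = np.exp(rho)[:, None]
    xs = cx + r * np.cos(theta)[None, :]
    ys = cy + r * np.sin(theta)[None, :]
    # Nearest-neighbour lookup, clipped to the image bounds.
    xi = np.clip(np.rint(xs).astype(int), 0, w - 1)
    yi = np.clip(np.rint(ys).astype(int), 0, h - 1)
    return image[yi, xi]


# A point, the same point scaled by 2, and the same point rotated by 90 degrees.
p = np.array([[3.0, 4.0]])
base = to_log_polar(p, (0.0, 0.0))
scaled = to_log_polar(2.0 * p, (0.0, 0.0))
rotated = to_log_polar(np.array([[-4.0, 3.0]]), (0.0, 0.0))

# Scaling shifts only the log-radius axis; rotation shifts only the angle axis.
scale_shift = scaled - base    # (log 2, 0)
rotation_shift = rotated - base  # (0, pi / 2)

# Foveated resampling of a toy 65x65 image around its center pixel.
toy_image = np.arange(65 * 65, dtype=float).reshape(65, 65)
foveated = log_polar_sample(toy_image, center=(32, 32))
```

Because both transformations appear as pure shifts along one axis of the resampled image, a convolutional layer applied to that image responds to them as it would to a translation. Note that the angle axis wraps around at 2π, which a practical implementation would likely handle with circular padding; that detail is also an assumption here.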