High-performance Semantic Segmentation Using Very Deep Fully Convolutional Networks.

arXiv: Computer Vision and Pattern Recognition (2016)

Abstract
We propose a method for high-performance semantic image segmentation (or semantic pixel labelling) based on very deep residual networks, which achieves state-of-the-art performance. A few design factors are carefully considered to this end. We make the following contributions. (i) First, we evaluate different variations of a fully convolutional residual network to find the best configuration, including the number of layers, the resolution of feature maps, and the size of the field-of-view. Our experiments show that further enlarging the field-of-view and increasing the resolution of feature maps are typically beneficial, but they inevitably lead to a higher demand for GPU memory. To work around this limitation, we propose a new method to simulate a high-resolution network with a low-resolution network, which can be applied during training and/or testing. (ii) Second, we propose an online bootstrapping method for training. We demonstrate that online bootstrapping is critically important for achieving good accuracy. (iii) Third, we apply traditional dropout to some of the residual blocks, which further improves the performance. (iv) Finally, our method achieves the current best mean intersection-over-union, 78.3%, on the PASCAL VOC 2012 dataset, as well as on the recent Cityscapes dataset.

∗ This research was in part supported by the Data to Decisions Cooperative Research Centre. C. Shen's participation was in part supported by an ARC Future Fellowship (FT120100969). C. Shen is the corresponding author.

arXiv:1604.04339v1 [cs.CV] 15 Apr 2016
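The mean intersection-over-union figure quoted in the abstract is the standard segmentation metric: for each class, the number of pixels correctly assigned to that class divided by the number of pixels that are either predicted as or labelled as that class, averaged over classes. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name `mean_iou`, the flat-list input format, and the `ignore_label=255` default (the usual "void" label in PASCAL VOC annotations) are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean intersection-over-union for two flat sequences of class labels.

    pred and target are equal-length sequences of integer class ids.
    Pixels whose ground-truth label equals ignore_label are skipped
    (assumed convention, hypothetical default).
    """
    # confusion[t][p] counts pixels with ground truth t that were predicted as p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]                                   # correctly labelled c
        fn = sum(confusion[c]) - tp                            # class c missed
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # wrongly labelled c
        union = tp + fp + fn
        if union > 0:  # classes absent from both prediction and ground truth are skipped
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Usage: four pixels, two classes; one class-1 pixel is mislabelled as class 0.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. In practice the confusion matrix is accumulated over the whole test set before the per-class ratios are taken, rather than averaging per-image scores.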