Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials

NIPS, (2012): 109-117


Abstract

Most state-of-the-art techniques for multi-class image segmentation and labeling use conditional random fields defined over pixels or image regions. While region-level models often feature dense pairwise connectivity, pixel-level models are considerably larger and have only permitted sparse graph structures. In this paper, we consider f...

Introduction
  • Multi-class image segmentation and labeling is one of the most challenging and actively studied problems in computer vision.
  • The accuracy of these approaches is necessarily restricted by the accuracy of unsupervised image segmentation, which is used to compute the regions on which the model operates.
  • This limits the ability of region-based approaches to produce accurate label assignments around complex object boundaries, although significant progress has been made [9, 13, 14].
Highlights
  • Multi-class image segmentation and labeling is one of the most challenging and actively studied problems in computer vision
  • Basic conditional random field (CRF) models are composed of unary potentials on individual pixels or image patches and pairwise potentials on neighboring pixels or patches [19, 23, 7, 5]
  • In order to improve segmentation and labeling accuracy, researchers have expanded the basic CRF framework to incorporate hierarchical connectivity and higher-order potentials defined on image regions [8, 12, 9, 13]
  • We evaluate the presented algorithm on two standard benchmarks for multi-class image segmentation and labeling
  • The first is the MSRC-21 dataset, which consists of 591 color images of size 320 × 213 with corresponding ground truth labelings of 21 object classes [19]
  • We have presented a highly efficient approximate inference algorithm for fully connected CRF models
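The model structure described in the highlights (unary potentials on individual pixels plus pairwise potentials on pixel pairs) can be written as a single energy. The notation below is an assumption for illustration and does not appear on this page: $x_i$ is the label of pixel $i$, $\psi_u$ and $\psi_p$ are the unary and pairwise potentials, and $\mathcal{E}$ is the set of connected pixel pairs.

```latex
E(\mathbf{x}) \;=\; \sum_{i} \psi_u(x_i) \;+\; \sum_{(i,j) \in \mathcal{E}} \psi_p(x_i, x_j)
```

In a basic CRF, $\mathcal{E}$ contains only neighboring pixels or patches. In a fully connected CRF, $\mathcal{E}$ contains every pair $i < j$, so the number of pairwise terms grows quadratically with the number of pixels.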
Results
  • The authors evaluate the presented algorithm on two standard benchmarks for multi-class image segmentation and labeling.
  • The first is the MSRC-21 dataset, which consists of 591 color images of size 320 × 213 with corresponding ground truth labelings of 21 object classes [19].
  • The second is the PASCAL VOC 2010 dataset, which contains 1928 color images of size approximately 500 × 400, with a total of 20 object classes and one background class [3].
  • The inference algorithm was implemented in a single CPU thread
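To give a sense of scale for the benchmark image sizes above, the following sketch counts pairwise terms for a sparse grid model versus a fully connected model. The function name `pairwise_edge_counts` and the choice of a 4-connected grid as the sparse baseline are assumptions made for illustration; the page does not state which sparse structure earlier models used.

```python
def pairwise_edge_counts(width, height):
    """Return (pixels, 4-connected grid edges, fully connected edges)."""
    n = width * height
    # 4-connected grid: horizontal neighbor pairs plus vertical neighbor pairs.
    grid = (width - 1) * height + width * (height - 1)
    # Fully connected: one edge for every unordered pair of pixels.
    dense = n * (n - 1) // 2
    return n, grid, dense


if __name__ == "__main__":
    n, grid, dense = pairwise_edge_counts(320, 213)  # MSRC-21 image size
    print(f"pixels: {n}, grid edges: {grid}, dense edges: {dense}")
```

For a 320 × 213 image this gives 68,160 pixels, 135,787 grid edges, and 2,322,858,720 fully connected edges, which is why naive inference over all pairs is impractical at pixel level.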
Conclusion
  • The authors have presented a highly efficient approximate inference algorithm for fully connected CRF models.
  • The authors' results demonstrate that dense pixel-level connectivity leads to significantly more accurate pixel-level classification performance.
  • The authors' single-threaded implementation processes benchmark images in a fraction of a second, and the algorithm can be parallelized for further performance gains.
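As a rough illustration of approximate inference in a fully connected CRF with a Gaussian edge potential, here is a brute-force mean-field style update in plain Python. It is a sketch under stated assumptions, not the authors' algorithm: it assumes a Potts label compatibility (a penalty only when two labels differ), a single Gaussian kernel over hypothetical per-pixel feature vectors, and synchronous updates. It costs O(N²) per iteration, so it is only suitable for toy inputs and does not reproduce the efficiency reported above.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def naive_dense_mean_field(unary, features, weight=1.0, bandwidth=1.0, n_iters=5):
    """Brute-force mean-field updates on a small fully connected CRF.

    unary:    N rows of per-label energies (lower means more likely).
    features: N feature vectors (for example position or color); hypothetical.
    Assumes a Potts compatibility and one Gaussian kernel (illustrative only).
    """
    n, n_labels = len(unary), len(unary[0])

    def gaussian(i, j):
        sq = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
        return weight * math.exp(-sq / (2.0 * bandwidth ** 2))

    # Dense kernel matrix; a pixel sends no message to itself.
    k = [[0.0 if i == j else gaussian(i, j) for j in range(n)] for i in range(n)]

    # Initialize each marginal from the unary energies alone.
    q = [softmax([-u for u in row]) for row in unary]
    for _ in range(n_iters):
        new_q = []
        for i in range(n):
            # msg[l] = sum_j k(i, j) * Q_j(l)
            msg = [sum(k[i][j] * q[j][l] for j in range(n)) for l in range(n_labels)]
            total = sum(msg)
            # Potts: label l is penalized by the mass on all other labels.
            energies = [unary[i][l] + (total - msg[l]) for l in range(n_labels)]
            new_q.append(softmax([-e for e in energies]))
        q = new_q  # synchronous update of all pixels
    return q


if __name__ == "__main__":
    # Four pixels on a line; pixel 2 weakly prefers label 1, the rest prefer label 0.
    unary = [[0.0, 2.0], [0.0, 2.0], [0.6, 0.4], [0.0, 2.0]]
    features = [[0.0], [1.0], [2.0], [3.0]]
    for row in naive_dense_mean_field(unary, features):
        print([round(p, 3) for p in row])
```

In this toy run the weakly labeled pixel is pulled toward the label of its neighbors, while setting `weight=0.0` leaves every marginal determined by its unary energies alone.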
Funding
  • Philipp Krahenbuhl was supported in part by a Stanford Graduate Fellowship.
Reference
  • [1] A. Adams, J. Baek, and M. A. Davis. Fast high-dimensional filtering using the permutohedral lattice. Computer Graphics Forum, 29(2), 2010.
  • [2] A. Adams, N. Gelfand, J. Dolson, and M. Levoy. Gaussian kd-trees for fast high-dimensional filtering. ACM Transactions on Graphics, 28(3), 2009.
  • [3] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes (VOC) challenge. IJCV, 88(2), 2010.
  • [4] P. F. Felzenszwalb, R. B. Girshick, and D. A. McAllester. Cascade object detection with deformable part models. In Proc. CVPR, 2010.
  • [5] B. Fulkerson, A. Vedaldi, and S. Soatto. Class segmentation and object localization with superpixel neighborhoods. In Proc. ICCV, 2009.
  • [6] C. Galleguillos, A. Rabinovich, and S. Belongie. Object categorization using co-occurrence, location and appearance. In Proc. CVPR, 2008.
  • [7] S. Gould, J. Rodgers, D. Cohen, G. Elidan, and D. Koller. Multi-class segmentation with relative location prior. IJCV, 80(3), 2008.
  • [8] X. He, R. S. Zemel, and M. A. Carreira-Perpinan. Multiscale conditional random fields for image labeling. In Proc. CVPR, 2004.
  • [9] P. Kohli, L. Ladicky, and P. H. S. Torr. Robust higher order potentials for enforcing label consistency. IJCV, 82(3), 2009.
  • [10] D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
  • [11] V. Kolmogorov and R. Zabih. What energy functions can be minimized via graph cuts? PAMI, 26(2), 2004.
  • [12] S. Kumar and M. Hebert. A hierarchical field framework for unified context-based classification. In Proc. ICCV, 2005.
  • [13] L. Ladicky, C. Russell, P. Kohli, and P. H. S. Torr. Associative hierarchical CRFs for object class image segmentation. In Proc. ICCV, 2009.
  • [14] L. Ladicky, C. Russell, P. Kohli, and P. H. S. Torr. Graph cut based inference with co-occurrence statistics. In Proc. ECCV, 2010.
  • [15] J. D. Lafferty, A. McCallum, and F. C. N. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proc. ICML, 2001.
  • [16] S. Paris and F. Durand. A fast approximation of the bilateral filter using a signal processing approach. IJCV, 81(1), 2009.
  • [17] N. Payet and S. Todorovic. (RF)^2 – random forest random field. In Proc. NIPS, 2010.
  • [18] A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, and S. Belongie. Objects in context. In Proc.
  • [19] J. Shotton, J. M. Winn, C. Rother, and A. Criminisi. TextonBoost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. IJCV, 81(1), 2009.
  • [20] S. W. Smith. The scientist and engineer’s guide to digital signal processing. California Technical Publishing, 1997.
  • [21] A. Torralba, K. P. Murphy, and W. T. Freeman. Sharing visual features for multiclass and multiview object detection. PAMI, 29(5), 2007.
  • [22] T. Toyoda and O. Hasegawa. Random field model for integration of local information and global information. PAMI, 30, 2008.
  • [23] J. J. Verbeek and B. Triggs. Scene segmentation with CRFs learned from partially labeled images. In Proc. NIPS, 2007.