Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials
NIPS (2012): 109-117
Most state-of-the-art techniques for multi-class image segmentation and labeling use conditional random fields defined over pixels or image regions. While region-level models often feature dense pairwise connectivity, pixel-level models are considerably larger and have only permitted sparse graph structures. In this paper, we consider f...
- Multi-class image segmentation and labeling is one of the most challenging and actively studied problems in computer vision.
- The accuracy of region-based approaches is necessarily restricted by the accuracy of unsupervised image segmentation, which is used to compute the regions on which the model operates.
- This limits their ability to produce accurate label assignments around complex object boundaries, although significant progress has been made [9, 13, 14].
- Basic conditional random field (CRF) models are composed of unary potentials on individual pixels or image patches and pairwise potentials on neighboring pixels or patches [19, 23, 7, 5].
- To improve segmentation and labeling accuracy, researchers have expanded the basic CRF framework to incorporate hierarchical connectivity and higher-order potentials defined on image regions [8, 12, 9, 13].
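The structure of such a model can be written compactly as an energy over a labeling. The following is a sketch in assumed notation (the symbols $\psi_u$, $\psi_p$, and $\mathcal{E}$ are chosen here for illustration and do not appear in the summary): $x_i$ is the label of pixel or patch $i$, $\psi_u$ is the unary potential, and $\psi_p$ is the pairwise potential over an edge set $\mathcal{E}$.

```latex
E(\mathbf{x}) = \sum_{i} \psi_u(x_i) + \sum_{(i,j) \in \mathcal{E}} \psi_p(x_i, x_j)
```

In a sparse model, $\mathcal{E}$ contains only pairs of neighboring pixels or patches. In a fully connected model, $\mathcal{E}$ contains every pair $i < j$, so the number of pairwise terms grows quadratically with the number of pixels.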
- The authors evaluate the presented algorithm on two standard benchmarks for multi-class image segmentation and labeling.
- The first is the MSRC-21 dataset, which consists of 591 color images of size 320 × 213 with corresponding ground truth labelings of 21 object classes.
- The second is the PASCAL VOC 2010 dataset, which contains 1928 color images of size approximately 500 × 400, with a total of 20 object classes and one background class.
- The inference algorithm was implemented in a single CPU thread.
- The authors have presented a highly efficient approximate inference algorithm for fully connected CRF models.
- The authors' results demonstrate that dense pixel-level connectivity leads to significantly more accurate pixel-level classification performance.
- The authors' single-threaded implementation processes benchmark images in a fraction of a second, and the algorithm can be parallelized for further performance gains.
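The summary does not spell out the inference procedure, so the following is only a rough illustration of approximate inference in a small fully connected CRF with Gaussian edge potentials: a brute-force mean-field update written in plain Python. The function names, the single Gaussian kernel, the Potts label compatibility, and the `weight` and `bandwidth` parameters are all assumptions made for this sketch. The loop costs quadratic time in the number of pixels and is not the authors' efficient implementation.

```python
import math


def softmax(scores):
    """Turn a list of log-scores into probabilities (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_kernel(f_i, f_j, bandwidth=1.0):
    """Gaussian edge weight between two feature vectors (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(f_i, f_j))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_field(unary, features, weight=1.0, bandwidth=1.0, n_iters=5):
    """Brute-force mean-field inference on a tiny fully connected CRF.

    unary:    list of N rows, each with one energy per label (lower is better).
    features: list of N feature vectors (for example scaled position and color).
    Returns a list of N per-pixel label distributions.
    """
    n = len(unary)
    n_labels = len(unary[0])
    # Dense kernel matrix over all pixel pairs; no self-edges.
    k = [[0.0 if i == j else gaussian_kernel(features[i], features[j], bandwidth)
          for j in range(n)] for i in range(n)]
    # Initialize beliefs from the unary energies alone.
    q = [softmax([-u for u in row]) for row in unary]
    for _ in range(n_iters):
        new_q = []
        for i in range(n):
            # msg[l]: kernel-weighted belief that the other pixels place on label l.
            msg = [sum(k[i][j] * q[j][l] for j in range(n)) for l in range(n_labels)]
            total = sum(msg)
            # Potts compatibility: label l is penalized by the mass on all other labels.
            scores = [-unary[i][l] - weight * (total - msg[l]) for l in range(n_labels)]
            new_q.append(softmax(scores))
        q = new_q  # synchronous update of all pixels
    return q


if __name__ == "__main__":
    # Three pixels with 1-D features; the third has a weak preference for label 1.
    unary = [[0.0, 3.0], [0.0, 3.0], [0.6, 0.4]]
    features = [[0.0], [0.1], [0.2]]
    beliefs = mean_field(unary, features, weight=2.0)
    print([row.index(max(row)) for row in beliefs])
```

In this toy input the third pixel starts with a slight preference for label 1, but its two similar neighbors strongly prefer label 0, so the dense pairwise term pulls it to label 0 after the updates. Replacing the explicit double loop over pixel pairs with a fast approximate Gaussian filter is what would make this kind of update practical at image scale.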
- Philipp Krähenbühl was supported in part by a Stanford Graduate Fellowship.
- [1] A. Adams, J. Baek, and M. A. Davis. Fast high-dimensional filtering using the permutohedral lattice. Computer Graphics Forum, 29(2), 2010.
- [2] A. Adams, N. Gelfand, J. Dolson, and M. Levoy. Gaussian kd-trees for fast high-dimensional filtering. ACM Transactions on Graphics, 28(3), 2009.
- [3] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes (VOC) challenge. IJCV, 88(2), 2010.
- [4] P. F. Felzenszwalb, R. B. Girshick, and D. A. McAllester. Cascade object detection with deformable part models. In Proc. CVPR, 2010.
- [5] B. Fulkerson, A. Vedaldi, and S. Soatto. Class segmentation and object localization with superpixel neighborhoods. In Proc. ICCV, 2009.
- [6] C. Galleguillos, A. Rabinovich, and S. Belongie. Object categorization using co-occurrence, location and appearance. In Proc. CVPR, 2008.
- [7] S. Gould, J. Rodgers, D. Cohen, G. Elidan, and D. Koller. Multi-class segmentation with relative location prior. IJCV, 80(3), 2008.
- [8] X. He, R. S. Zemel, and M. A. Carreira-Perpinan. Multiscale conditional random fields for image labeling. In Proc. CVPR, 2004.
- [9] P. Kohli, L. Ladicky, and P. H. S. Torr. Robust higher order potentials for enforcing label consistency. IJCV, 82(3), 2009.
- [10] D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
- [11] V. Kolmogorov and R. Zabih. What energy functions can be minimized via graph cuts? PAMI, 26(2), 2004.
- [12] S. Kumar and M. Hebert. A hierarchical field framework for unified context-based classification. In Proc. ICCV, 2005.
- [13] L. Ladicky, C. Russell, P. Kohli, and P. H. S. Torr. Associative hierarchical CRFs for object class image segmentation. In Proc. ICCV, 2009.
- [14] L. Ladicky, C. Russell, P. Kohli, and P. H. S. Torr. Graph cut based inference with co-occurrence statistics. In Proc. ECCV, 2010.
- [15] J. D. Lafferty, A. McCallum, and F. C. N. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proc. ICML, 2001.
- [16] S. Paris and F. Durand. A fast approximation of the bilateral filter using a signal processing approach. IJCV, 81(1), 2009.
- [17] N. Payet and S. Todorovic. (RF)² – random forest random field. In Proc. NIPS, 2010.
- [18] A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, and S. Belongie. Objects in context. In Proc.
- [19] J. Shotton, J. M. Winn, C. Rother, and A. Criminisi. TextonBoost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. IJCV, 81(1), 2009.
- [20] S. W. Smith. The scientist and engineer’s guide to digital signal processing. California Technical Publishing, 1997.
- [21] A. Torralba, K. P. Murphy, and W. T. Freeman. Sharing visual features for multiclass and multiview object detection. PAMI, 29(5), 2007.
- [22] T. Toyoda and O. Hasegawa. Random field model for integration of local information and global information. PAMI, 30, 2008.
- [23] J. J. Verbeek and B. Triggs. Scene segmentation with CRFs learned from partially labeled images. In Proc. NIPS, 2007.