Graphical models for high-level computer vision (2009)

Abstract
Image understanding has long been a shared goal in the field of computer vision. Extracting the required scene-level information from image data is a formidable task, however. While the raw data comes in the form of matrices of numbers, the inferences that we must perform occur at a much higher level of abstraction. Much progress has been made in recent years in extracting the primitives of an image in isolation, for example detecting the objects, labeling the regions, or extracting the surfaces. Modeling the interactions and fine-grained distinctions between these primitives is an important next step along the path to scene understanding. Because this involves reasoning about relationships between heterogeneous entities at a high level of abstraction, this problem lends itself well to the tools of probabilistic graphical models. In this thesis we consider two important challenges in this space. In the first, we model interactions between primitives of various types in order to capture the contextual relationships between them. For example, we can expect to find a sheep (a detected object) more often on a field of grass (a labeled region) than on a patch of water. By modeling the interactions between these components, we can expect to improve the quality of classification by leveraging these contextual cues. We first introduce Cascaded Classification Models (CCM), a flexible framework for combining various state-of-the-art vision models in a way that allows for improved performance of each model. We next consider the tasks of object detection and region labeling and develop a more sophisticated probabilistic model aimed at capturing the contextual relationships between these types of primitives in a more targeted and meaningful way. Our Things and Stuff (TAS) context model learns to leverage contextual cues directly from data.
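The contextual-cue idea described above can be illustrated with a toy sketch. This is not the thesis's actual CCM or TAS formulation; it is a minimal log-linear rescoring in which a hypothetical region-compatibility weight shifts an object detector's confidence, with all weights and label names invented for illustration.

```python
# Toy sketch (NOT the thesis's CCM/TAS model): rescore an object
# detector's output using the label of a neighboring region as a
# contextual cue, via a simple log-linear combination.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical compatibility weights: how strongly each region label
# supports the "sheep" hypothesis (positive = supportive).
CONTEXT_WEIGHT = {"grass": 1.5, "water": -1.5, "road": -0.5}

def rescored_sheep_prob(detector_logit, region_label):
    """Probability of 'sheep' after adding the contextual weight of the
    surrounding region label to the detector's raw logit."""
    return sigmoid(detector_logit + CONTEXT_WEIGHT.get(region_label, 0.0))

# An uncertain detection (logit 0 -> p = 0.5 in isolation) becomes more
# plausible on grass and less plausible on water.
p_grass = rescored_sheep_prob(0.0, "grass")
p_water = rescored_sheep_prob(0.0, "water")
```

In the actual models, such compatibilities are not hand-set but learned from data, and inference couples many detections and regions jointly rather than rescoring one detection at a time.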
Following this exploration of interactions between objects, we consider the interactions of "parts" within an object, and in particular tackle the problem of descriptive querying of objects. This involves making distinctions about the object at a refined level, beyond mere categorization. For example, we may want to know whether a cheetah in an image is running or standing still. We introduce a probabilistic deformable shape model (LOOPS) and a method for matching this model to an image that allows for precise localization of several object classes. Using this localization, we show how descriptive distinctions can be drawn with a small amount of training data.
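The deformable shape matching described above can likewise be sketched in miniature. This is an illustrative stand-in, not the LOOPS model itself: a candidate placement of object landmarks is scored as an isotropic Gaussian shape prior around a hypothetical mean shape, plus per-landmark appearance evidence; the mean shape, variance, and example configurations are all invented.

```python
# Toy sketch (NOT the thesis's LOOPS formulation): score a candidate
# landmark configuration with a Gaussian shape prior plus appearance
# evidence. All shapes and parameters are hypothetical.

MEAN_SHAPE = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]  # hypothetical
SHAPE_VAR = 0.1  # hypothetical isotropic variance

def shape_log_prior(landmarks):
    """Log-density (constants dropped) of landmark positions under an
    isotropic Gaussian centered on the mean shape."""
    sq_dev = sum((x - mx) ** 2 + (y - my) ** 2
                 for (x, y), (mx, my) in zip(landmarks, MEAN_SHAPE))
    return -sq_dev / (2 * SHAPE_VAR)

def match_score(landmarks, appearance_log_likes):
    """Combine the shape prior with per-landmark appearance evidence."""
    return shape_log_prior(landmarks) + sum(appearance_log_likes)

# With equal appearance evidence, a configuration near the mean shape
# outscores a heavily deformed one.
near = [(0.05, 0.0), (1.0, 0.05), (0.95, 1.0), (0.0, 0.95)]
far = [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (0.5, 1.5)]
score_near = match_score(near, [0.0] * 4)
score_far = match_score(far, [0.0] * 4)
```

The real model must also search over landmark placements in the image (the matching problem), and it is that precise localization that enables descriptive distinctions, such as running versus standing, from little training data.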
Keywords
contextual cue,image data,probabilistic graphical model,model interaction,contextual relationship,image understanding,sophisticated probabilistic model,high-level computer vision,probabilistic deformable shape model,various state-of-the-art vision model,context model