Surveillance Video Parsing with Single Frame Supervision

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)(2016)

Abstract
Surveillance video parsing, which segments video frames into several labels (e.g., face, pants, left-leg), has wide applications. However, annotating every frame pixel by pixel is tedious and inefficient. In this paper, we develop a Single frame Video Parsing (SVP) method which requires only one labeled frame per video in the training stage. To parse one particular frame, the video segment preceding the frame is jointly considered. SVP (1) roughly parses the frames within the video segment, (2) estimates the optical flow between frames, and (3) fuses the rough parsing results warped by optical flow to produce the refined parsing result. The three components of SVP, namely frame parsing, optical flow estimation, and temporal fusion, are integrated in an end-to-end manner. Experimental results on two surveillance video datasets show the superiority of SVP over state-of-the-art methods.
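The warp-and-fuse idea in steps (2) and (3) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `warp_by_flow` and `fuse` are hypothetical, the rough parsing scores and the flow field are assumed to be given as arrays (in SVP they would come from learned networks), and a fixed weighted average stands in for the learned temporal fusion.

```python
import numpy as np


def warp_by_flow(prob, flow):
    """Backward-warp a (H, W, C) score map using a (H, W, 2) flow field.

    Assumption: flow[y, x] = (dx, dy) points from a pixel in the target
    frame to its source location in the earlier frame. Sampling is
    nearest-neighbour, and out-of-range coordinates are clipped.
    """
    h, w, _ = prob.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prob[src_y, src_x]


def fuse(target_prob, warped_probs, weights=None):
    """Fuse the target frame's rough scores with warped earlier scores.

    A plain weighted average (uniform by default) is used here as a
    stand-in for a learned fusion step. Returns a (H, W) label map.
    """
    stack = np.stack([target_prob] + list(warped_probs), axis=0)
    if weights is None:
        weights = np.full(len(stack), 1.0 / len(stack))
    weights = np.asarray(weights, dtype=float)
    fused = np.tensordot(weights, stack, axes=1)  # shape (H, W, C)
    return fused.argmax(axis=-1)
```

In use, each earlier frame's rough score map would be warped to the target frame with its own flow field, and the list of warped maps passed to `fuse` together with the target frame's rough scores.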
Keywords
surveillance video parsing, single frame supervision, left-leg, labeled frame, rough parsing results, refined parsing result, optical flow estimation, surveillance video datasets, video parsing datasets, video frame segmentation, single frame video parsing method, SVP, training stage, temporal fusion