Memory Aggregation Networks for Efficient Interactive Video Object Segmentation

CVPR (2020)

Abstract
Interactive video object segmentation (iVOS) aims to efficiently obtain high-quality segmentation masks of a target object in a video through user interactions. Most previous state-of-the-art methods tackle iVOS with two independent networks, one for handling user interaction and one for temporal propagation, which leads to inefficiency at inference time. In this work, we propose a unified framework, named Memory Aggregation Networks (MA-Net), to address the challenging iVOS task more efficiently. MA-Net integrates the interaction and propagation operations into a single network, which significantly improves the efficiency of iVOS in the multi-round interaction setting. More importantly, we propose a simple yet effective memory aggregation mechanism that records informative knowledge from previous interaction rounds, greatly improving robustness in discovering challenging objects of interest. We conduct extensive experiments on the validation set of the DAVIS Challenge 2018 benchmark. In particular, MA-Net achieves a J@60 score of 76.1% without any bells and whistles, outperforming the state of the art by more than 2.7%.
Keywords
efficient interactive video object segmentation, high-quality segmentation masks, target object, user interaction, previous state-of-the-arts, independent networks, temporal propagation, propagation operations, multiround interactions, interaction rounds, memory aggregation mechanism, MA-Net, iVOS, named memory aggregation networks, DAVIS Challenge 2018 benchmark