IAST: Instance Association Relying on Spatio-Temporal Features for Video Instance Segmentation

Junhao Chen, Sheng Liu, Ruixiang Chen, Bingnan Guo, Feng Zhang

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2023)

Abstract

Most offline video instance segmentation (VIS) methods do not take multi-scale spatio-temporal features into account, which leads to unstable instance association across frames. To address this problem, we propose IAST, which builds Instance Association relying on Spatio-Temporal features for video instance segmentation. Specifically, we design a novel Scale-to-Scale Attention Module in the encoder of IAST, which constructs stable cross-frame instance associations by fully leveraging multi-scale spatio-temporal features. In addition, we introduce a new data augmentation method called Sequential Copy-Paste, which effectively alleviates the overfitting caused by insufficient training data and enhances the robustness of the model. Empirically, IAST achieves state-of-the-art results on VIS benchmarks with a ResNet-50 backbone: 47.4% AP and 41.6% AP on YouTube-VIS 2019 and 2021, respectively. These results outperform the previous state of the art by 1.0% while using fewer parameters. Code is available at https://github.com/clozureyez/IAST.

Keywords

video instance segmentation, spatio-temporal features, instance association, data augmentation
