IAST: Instance Association Relying on Spatio-Temporal Features for Video Instance Segmentation

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023

Abstract
Most offline video instance segmentation (VIS) methods do not consider multi-scale spatio-temporal features, which leads to unstable instance association across frames. To address this problem, we propose IAST, which builds Instance Association relying on Spatio-Temporal features for video instance segmentation. Specifically, we design a novel Scale-to-Scale Attention Module in the encoder of IAST, which constructs stable cross-frame instance associations by fully leveraging multi-scale spatio-temporal features. In addition, we introduce a new data augmentation method called Sequential Copy-Paste, which effectively alleviates the overfitting caused by insufficient training data and enhances the robustness of the model. Empirically, IAST achieves state-of-the-art results on VIS benchmarks with a ResNet-50 backbone: 47.4% AP on YouTube-VIS 2019 and 41.6% AP on YouTube-VIS 2021. These results outperform the previous state of the art by 1.0% while using fewer parameters. Code is available at https://github.com/clozureyez/IAST.
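The abstract names Sequential Copy-Paste but does not describe its mechanics. As a rough illustration only, the sketch below shows one plausible reading: an instance is copied from every frame of a source clip and pasted into the matching frame of a target clip, so the pasted object keeps a single identity over time. The function name `sequential_copy_paste`, its arguments, and the grayscale nested-list frame format are all assumptions made for this example and are not taken from the IAST code.

```python
from typing import List, Tuple

Grid = List[List[int]]  # H x W grid: pixel values, 0/1 masks, or instance ids


def sequential_copy_paste(
    target_frames: List[Grid],
    target_ids: List[Grid],
    source_frames: List[Grid],
    source_masks: List[Grid],
    new_id: int,
) -> Tuple[List[Grid], List[Grid]]:
    """Hypothetical sketch: paste one source instance into each target frame.

    target_ids holds per-frame instance-id maps (0 = background).
    The pasted instance receives the same id (new_id) in every frame,
    and it occludes whatever was underneath. Inputs are not modified.
    """
    if not (len(target_frames) == len(target_ids)
            == len(source_frames) == len(source_masks)):
        raise ValueError("all clips must have the same number of frames")

    out_frames, out_ids = [], []
    for tgt, ids, src, mask in zip(
        target_frames, target_ids, source_frames, source_masks
    ):
        new_frame = [row[:] for row in tgt]  # copy so inputs stay intact
        new_ids = [row[:] for row in ids]
        for y, mask_row in enumerate(mask):
            for x, inside in enumerate(mask_row):
                if inside:
                    new_frame[y][x] = src[y][x]
                    new_ids[y][x] = new_id
        out_frames.append(new_frame)
        out_ids.append(new_ids)
    return out_frames, out_ids


# Toy clip: two frames of size 2 x 3. Target instance 1 moves right;
# the pasted source instance (id 2) also moves right by one pixel.
target_frames = [[[0, 0, 0], [0, 0, 0]], [[0, 0, 0], [0, 0, 0]]]
target_ids = [[[1, 1, 0], [0, 0, 0]], [[0, 1, 1], [0, 0, 0]]]
source_frames = [[[9, 9, 9], [9, 9, 9]], [[8, 8, 8], [8, 8, 8]]]
source_masks = [[[1, 0, 0], [1, 0, 0]], [[0, 1, 0], [0, 1, 0]]]

aug_frames, aug_ids = sequential_copy_paste(
    target_frames, target_ids, source_frames, source_masks, new_id=2
)
```

In this toy clip the pasted instance partly covers target instance 1 in both frames while keeping id 2 throughout, which is the kind of temporally consistent occlusion a sequence-level augmentation would produce. A practical version would presumably also apply scaling, jitter, and mask filtering, none of which are shown here.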
Keywords
video instance segmentation, spatio-temporal features, instance association, data augmentation