Learning a Robust Model with Pseudo Boundaries for Noisy Temporal Action Localization.

Xinyi Yuan, Liansheng Zhuang

ACM Multimedia Asia (2023)

Temporal Action Localization (TAL) aims to locate the start and end times of actions and recognize their categories in untrimmed videos. Significant progress has been made in developing deep models for TAL. The success of previous methods relies on large-scale training data with precise boundary annotations. However, fully accurate annotations are impractical to obtain because action boundaries are ambiguous and labels are crowd-sourced, and the resulting boundary noise degrades performance. In this work, we take the first step toward learning with inaccurate boundaries in TAL. Motivated by the fact that inaccurate boundary annotations harm localization precision more than classification accuracy, we propose to use classification as a guidance signal to improve localization precision. Specifically, we introduce a pseudo-boundary generation and refinement method (PbGaR). PbGaR first treats each action segment as a bag of instances and selects the instances with more accurate boundaries for training. These boundaries are then refined via two strategies to improve their quality. The proposed method significantly alleviates the performance degradation of TAL models under inaccurate boundaries. Extensive experiments on two popular datasets demonstrate the effectiveness of our method.
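The bag-of-instances idea can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how instances are generated or scored, so the boundary-shifting scheme, the function names (`make_bag`, `select_pseudo_boundary`, `temporal_iou`), and the toy scoring function are all assumptions made for illustration. In practice the score would come from a classifier's confidence; here a temporal-IoU against a hidden "true" segment stands in for it so that the example runs on its own.

```python
from typing import Callable, List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def temporal_iou(a: Segment, b: Segment) -> float:
    """Temporal intersection-over-union of two segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def make_bag(noisy: Segment, max_shift: float, steps: int) -> List[Segment]:
    """Build a bag of candidate segments around a noisy annotation.

    Assumption: candidates are produced by shifting each boundary
    independently on a uniform grid in [-max_shift, +max_shift].
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    start, end = noisy
    step = 2 * max_shift / (steps - 1)
    offsets = [-max_shift + i * step for i in range(steps)]
    bag = []
    for ds in offsets:
        for de in offsets:
            s, e = start + ds, end + de
            if e > s:  # keep only valid, non-empty segments
                bag.append((s, e))
    return bag


def select_pseudo_boundary(
    bag: List[Segment], score: Callable[[Segment], float]
) -> Segment:
    """Pick the instance with the highest (classification-style) score."""
    return max(bag, key=score)


# Toy setup: the true segment is hidden from the selection step and is
# only used inside the stand-in scoring function.
true_segment = (10.0, 20.0)
noisy_segment = (12.0, 23.0)


def toy_score(seg: Segment) -> float:
    # Hypothetical stand-in for classifier confidence.
    return temporal_iou(seg, true_segment)


bag = make_bag(noisy_segment, max_shift=3.0, steps=7)
pseudo = select_pseudo_boundary(bag, toy_score)
print(pseudo, temporal_iou(noisy_segment, true_segment))
```

In this toy setting the selected instance overlaps the true segment better than the original noisy annotation does, which is the behavior a classification-guided selection step is meant to achieve. The refinement strategies mentioned in the abstract are not specified there and are not sketched here.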