Learning a Robust Model with Pseudo Boundaries for Noisy Temporal Action Localization.

ACM Multimedia Asia（2023）

Cited 0|Views10

No score

Abstract

Temporal Action Localization (TAL) aims to locate starting and ending times of actions and recognize categories in untrimmed videos. Significant progress has been made in developing deep models for TAL. The success of previous methods relies on large-scale training data with precise boundary annotations. However, fully accurate annotations are unpractical to be obtained due to the ambiguities of the action boundaries and the crowd-sourcing labeling process, leading to a degradation in performance. In this work, we take the first step into learning with inaccurate boundaries in TAL tasks. Motivated by the fact that inaccurate boundary annotations harm localization precision more than classification accuracy, we propose to use classification as a guidance signal to improve localization precision. Specifically, we introduce a pseudo-boundary generation and refinement method (PbGaR). PbGaR first treats each action segment as a bag of instances to select the instances with more accurate boundaries for training. Then these boundaries are refined via two strategies for higher quality. The proposed method significantly alleviates the degraded performance of TAL models under inaccurate boundaries. Extensive experiments on two popular datasets demonstrate the effectiveness of our method.

Translated text

AI Read Science

Must-Reading Tree

Example

Generate MRT to find the research sequence of this paper

Chat Paper

Summary is being generated by the instructions you defined