Efficient and Effective Weakly-Supervised Action Segmentation via Action-Transition-Aware Boundary Alignment
arXiv (2024)
Abstract
Weakly-supervised action segmentation is the task of learning to partition a
long video into several action segments when training videos are accompanied
only by transcripts (ordered lists of actions). Most existing methods
infer a pseudo segmentation for training by serially aligning all
frames with the transcript, which is time-consuming and hard to parallelize
during training. In this work, we aim to escape from this inefficient alignment
with massive but redundant frames, and instead to directly localize a few
action transitions for pseudo segmentation generation, where a transition
refers to the change from an action segment to its next adjacent one in the
transcript. As the true transitions are submerged in noisy boundaries due to
intra-segment visual variation, we propose a novel Action-Transition-Aware
Boundary Alignment (ATBA) framework to efficiently and effectively filter out
noisy boundaries and detect transitions. In addition, to boost semantic
learning when noise is inevitably present in the pseudo
segmentation, we also introduce video-level losses that exploit the trusted
video-level supervision. Extensive experiments show the effectiveness of our
approach in both performance and training speed.
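
To illustrate why a few transitions suffice to produce a pseudo segmentation, the sketch below expands a transcript and a list of transition positions into frame-wise labels. It is not the ATBA method itself: how transitions are detected is not shown, and the function name `pseudo_segmentation`, the convention that each transition is the first frame index of the next action, and the example action names are all assumptions made for illustration.

```python
from typing import List


def pseudo_segmentation(
    transcript: List[str], transitions: List[int], num_frames: int
) -> List[str]:
    """Expand a transcript into frame-wise labels (hypothetical helper).

    Assumed convention: transitions[i] is the index of the first frame
    that belongs to transcript[i + 1].
    """
    if not transcript:
        raise ValueError("transcript must contain at least one action")
    if len(transitions) != len(transcript) - 1:
        raise ValueError("need exactly one transition per adjacent action pair")

    # Segment boundaries: video start, each transition, video end.
    bounds = [0] + list(transitions) + [num_frames]
    if any(start >= end for start, end in zip(bounds, bounds[1:])):
        raise ValueError("transitions must be strictly increasing and inside the video")

    labels: List[str] = []
    for action, start, end in zip(transcript, bounds, bounds[1:]):
        # Every frame in [start, end) receives the same action label.
        labels.extend([action] * (end - start))
    return labels


if __name__ == "__main__":
    frames = pseudo_segmentation(["pour_milk", "stir", "drink"], [2, 5], 7)
    print(frames)
```

Under this convention, a transcript of N actions needs only N - 1 transition positions, regardless of how many frames the video contains.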