Stacking-Based Attention Temporal Convolutional Network for Action Segmentation

Liu Yang, Yu Jiang, Junkun Hong, Zhenjie Wu, Zhan Yang, Jun Long

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023

Abstract

Action segmentation, which is implemented as frame-wise action classification, plays an important role in video understanding. Recent works on action segmentation capture long-term dependencies by stacking more temporal convolution layers in Temporal Convolution Networks (TCNs). However, higher layers in TCNs see video features at a coarser granularity, which loses the fine-grained information needed for frame-wise action classification. To address this issue, we propose a novel Attention-based Temporal Convolution (ATC) block that uses a self-attention mechanism to capture fine-grained temporal dependencies for frame-wise action classification. By stacking ATC blocks, we design a Stacking-based Attention Temporal Convolutional Network (SATC) that adaptively captures long-term and short-term dependencies simultaneously, according to the semantic similarity of features across different temporal receptive fields. Experimental results demonstrate that SATC outperforms the baselines on three challenging datasets: GTEA, 50Salads and Breakfast.
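The idea of weighting several temporal receptive fields by feature similarity can be illustrated with a small sketch. This is not the ATC block from the paper: the abstract gives no architectural details, so the dilation rates `(1, 2, 4)`, the fixed three-tap averaging used in place of learned temporal convolutions, the scaled dot-product similarity, and all function names here are assumptions made purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def dilated_average(features, t, dilation):
    """Average frame t with its neighbours at +/- dilation (edges clamped).

    Stand-in for a learned dilated temporal convolution (assumption).
    """
    last = len(features) - 1
    idx = [max(0, t - dilation), t, min(last, t + dilation)]
    dim = len(features[0])
    return [sum(features[i][d] for i in idx) / 3.0 for d in range(dim)]


def attention_over_receptive_fields(features, dilations=(1, 2, 4)):
    """For each frame, mix branches with different receptive fields.

    Each branch is weighted by its scaled dot-product similarity to the
    frame's own feature vector, so frames can favour short-term or
    long-term context depending on which branch resembles them most.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for t, query in enumerate(features):
        branches = [dilated_average(features, t, d) for d in dilations]
        w = softmax([dot(query, b) / scale for b in branches])
        mixed = [
            sum(w[k] * branches[k][d] for k in range(len(branches)))
            for d in range(dim)
        ]
        outputs.append(mixed)
        weights.append(w)
    return outputs, weights


if __name__ == "__main__":
    # Toy video: four frames of one action followed by four of another.
    video = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
    _, attn = attention_over_receptive_fields(video)
    for t, w in enumerate(attn):
        print(t, [round(x, 3) for x in w])
```

In this toy input, a frame near an action boundary gives more weight to the branch whose context still lies inside its own action, which is the kind of adaptive short-term versus long-term selection the abstract describes. A real model would learn the convolution filters and attention projections instead of using fixed averages.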

Keywords

action segmentation, frame-wise action classification, temporal convolution network, deep learning
