Spatial-temporal hypergraph based on dual-stage attention network for multi-view data lightweight action recognition

Zhixuan Wu, Nan Ma, Cheng Wang, Cheng Xu, Genbao Xu, Mingxing Li

Pattern Recognition (2024)

Abstract
To address the problems of irrelevant frames and high model complexity in action recognition, we propose a Spatial-Temporal Hypergraph based on a Dual-Stage Attention Network (STHG-DAN) for lightweight multi-view action recognition. It comprises two stages: a Temporal Attention Mechanism based on a Trainable Threshold (TAM-TT) and Hypergraph Convolution based on a Dynamic Spatial-Temporal Attention Mechanism (HG-DSTAM). In the first stage, TAM-TT uses a learnable threshold to extract keyframes from multi-view videos, with the multi-view data ensuring that more comprehensive information is available to the subsequent stage. In the second stage, HG-DSTAM divides the human joints into three parts (trunk, hands and legs) to build spatial-temporal hypergraphs, extracts high-order features from the hypergraphs constructed over multi-view body joints, and feeds them into the dynamic spatial-temporal attention mechanism, which learns the intra-frame correlations of the multi-view joint features across body parts and thereby locates the salient regions of an action. We further employ multi-scale convolutions and depthwise separable networks, enabling efficient action recognition with few trainable parameters. Experiments on the NTU-RGB+D, NTU-RGB+D 120 and the imitating traffic police gesture datasets show that the model surpasses existing algorithms in both efficiency and accuracy, effectively improving machines' cognitive ability for human body language interaction.
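The abstract names two lightweight ingredients that can be illustrated in isolation: a trainable threshold that gates out irrelevant frames (TAM-TT) and depthwise separable convolutions that keep the parameter count small. Below is a minimal PyTorch sketch of both ideas under our own assumptions; the class and parameter names (TrainableThresholdKeyframeSelector, DepthwiseSeparableTCN, frame_score) are hypothetical and do not come from the paper, and the hypergraph convolution, body-part partition and multi-view fusion of HG-DSTAM are not reproduced here.

```python
# Hedged sketch only: the paper does not publish this code, so everything below
# is an assumption-based illustration rather than the authors' implementation.
import torch
import torch.nn as nn


class TrainableThresholdKeyframeSelector(nn.Module):
    """Scores each frame and softly keeps frames whose attention score exceeds
    a learnable threshold (our reading of TAM-TT's keyframe extraction)."""

    def __init__(self, feat_dim: int):
        super().__init__()
        self.frame_score = nn.Linear(feat_dim, 1)           # per-frame relevance score
        self.threshold = nn.Parameter(torch.tensor(0.5))    # trainable threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, feat_dim) pooled per-frame features from one view
        scores = torch.sigmoid(self.frame_score(x)).squeeze(-1)   # (B, T)
        # Soft gate instead of a hard cut, so the threshold stays differentiable.
        gate = torch.sigmoid((scores - self.threshold) * 10.0)    # (B, T)
        return x * gate.unsqueeze(-1)  # irrelevant frames are suppressed


class DepthwiseSeparableTCN(nn.Module):
    """Depthwise separable temporal convolution, the kind of block the abstract
    credits for the small number of trainable parameters."""

    def __init__(self, channels: int, kernel_size: int = 9):
        super().__init__()
        pad = kernel_size // 2
        self.depthwise = nn.Conv1d(channels, channels, kernel_size,
                                   padding=pad, groups=channels)
        self.pointwise = nn.Conv1d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames)
        return self.pointwise(self.depthwise(x))


if __name__ == "__main__":
    frames = torch.randn(2, 64, 256)                  # 2 clips, 64 frames, 256-d features
    kept = TrainableThresholdKeyframeSelector(256)(frames)
    out = DepthwiseSeparableTCN(256)(kept.transpose(1, 2))
    print(kept.shape, out.shape)                      # (2, 64, 256) and (2, 256, 64)
```

A soft sigmoid gate is used here so the threshold receives gradients; a hard top-k or binary cut at inference time would be a straightforward variant of the same idea.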
Keywords
Dual-stage attention network, Salient region, Spatial-temporal hypergraph neural network, Multi-view, Action recognition