HSGNet: hierarchically stacked graph network with attention mechanism for 3D human pose estimation

Multimedia Systems(2023)

Abstract
Because it represents the human skeleton well, the graph convolutional network (GCN) is a popular baseline for 3D human pose estimation (HPE). However, current GCN-based 3D HPE methods primarily use "message-passing" architectures that aggregate node information along the edges of the graph at a single scale. In such architectures, the learnt node features are uniform and cannot capture a hierarchical representation of graph-structured data. This study proposes a hierarchically stacked graph network (HSGNet) with an attention constraint for 3D HPE. An attention-constrained GCN layer (AGCN) serves as the basic unit of HSGNet. In the AGCN layer, attention coefficients are computed for each node to pick out the most important node and suppress redundant information from the neighbors during feature aggregation. A coarse graph layer with a pooling map is then devised to stack multiple GCN layers hierarchically; its pooling map matrix clusters the nodes of the graph according to the human skeleton structure. Finally, HSGNet is assembled in an encoder–decoder framework that embeds the global and local information of the full skeleton into the final feature used for 3D pose regression. The method was validated on two benchmark datasets, Human3.6M and MPI-INF-3DHP, and the experimental results showed good performance for 3D HPE.
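The two building blocks named in the abstract, attention-weighted neighbor aggregation and a pooling map that clusters joints into coarser nodes, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the five-joint toy skeleton, the dot-product attention score, the mean pooling, and the names `EDGES`, `POOL_MAP`, `attention_aggregate`, and `pool`. The actual AGCN layer and pooling map in HSGNet may be defined differently.

```python
import math

# Hypothetical 5-joint toy skeleton (NOT the Human3.6M joint layout).
# 0: pelvis, 1: spine, 2: head, 3: left hand, 4: right hand
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]

# Assumed pooling map: one row per coarse node, one column per joint.
POOL_MAP = [
    [1, 1, 1, 0, 0],  # torso: pelvis, spine, head
    [0, 0, 0, 1, 1],  # arms: left hand, right hand
]


def neighbors(node):
    """Return the node itself plus the joints connected to it by an edge."""
    nbrs = {node}
    for a, b in EDGES:
        if a == node:
            nbrs.add(b)
        elif b == node:
            nbrs.add(a)
    return sorted(nbrs)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(features):
    """Replace each joint feature by an attention-weighted mean of its neighborhood.

    The score between two joints is a plain dot product, an assumption
    chosen for brevity; a learned scoring function would go here.
    """
    dim = len(features[0])
    out = []
    for i, feat_i in enumerate(features):
        nbrs = neighbors(i)
        scores = [sum(x * y for x, y in zip(feat_i, features[j])) for j in nbrs]
        weights = softmax(scores)
        out.append([
            sum(w * features[j][d] for w, j in zip(weights, nbrs))
            for d in range(dim)
        ])
    return out


def pool(features, pool_map):
    """Mean-pool joint features into the coarse nodes defined by pool_map."""
    dim = len(features[0])
    pooled = []
    for row in pool_map:
        members = [j for j, flag in enumerate(row) if flag]
        pooled.append([
            sum(features[j][d] for j in members) / len(members)
            for d in range(dim)
        ])
    return pooled


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
    refined = attention_aggregate(feats)   # fine scale: 5 joints
    coarse = pool(refined, POOL_MAP)       # coarse scale: 2 body parts
    print(len(refined), len(coarse))
```

In this sketch the softmax weights let a joint emphasize the neighbors whose features score highest against its own, while the pooling map fixes the clustering in advance from the skeleton rather than learning it, which is one plausible reading of "according to the human skeleton structure".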
Keywords
Pose estimation, Hierarchical stacked graph network, Attention mechanism, Encoder–decoder framework