Exploiting Temporal Correlations for 3D Human Pose Estimation

IEEE Transactions on Multimedia (2024)

Abstract
Exploiting the rich temporal information in human pose sequences to facilitate 3D pose estimation has attracted considerable attention. While various learning architectures have been designed for temporal exploitation, they are usually trained with a 3D pose loss imposed independently on every single frame, with no explicit temporal signal introduced for supervision. This inevitably makes temporal exploitation harder, since the network must first infer meaningful temporal information from non-temporal, single-frame supervision; only then can it use this information to guide sequence modeling. Recently, some works have introduced temporal smoothness as an explicit supervision signal, which lets the network obtain temporal information directly from the supervision signal and thus improves temporal exploitation. However, temporal smoothness only roughly measures short-term temporal properties between adjacent frame pairs. In this work, we propose to generalize the supervision from temporal smoothness to temporal correlations, letting the network consider more comprehensive temporal properties of sequences more precisely. We contribute two novel correlation-based loss functions, which adopt different strategies to regularize the encoder and decoder sides of the network, respectively. In addition, we design a pre-training scheme to ensure that existing pose estimators generally converge under our correlation losses. Experiments on three benchmarks demonstrate that our method is compatible with different networks and improves their ability to exploit temporal information, yielding more accurate and robust pose estimates.
Keywords
3D human pose estimation,Temporal correlation,Sequence modeling
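The abstract contrasts three kinds of supervision: a per-frame 3D pose loss, temporal smoothness between adjacent frames, and temporal correlations across the whole sequence. The abstract does not give the form of the two proposed losses, so the sketch below is only an illustration of that contrast, not the paper's method. The function names, the (T, J, 3) array layout, and the choice of a frame-to-frame cosine-similarity matrix as the "correlation" are all assumptions made for this example.

```python
import numpy as np

# Assumed layout: a pose sequence is an array of shape (T, J, 3),
# i.e. T frames, J joints, 3D coordinates. All names here are hypothetical.

def per_frame_loss(pred, gt):
    """Mean per-joint position error, computed independently on each frame."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

def smoothness_loss(pred, gt):
    """Compare displacements between adjacent frames only (short-term signal)."""
    pred_vel = pred[1:] - pred[:-1]
    gt_vel = gt[1:] - gt[:-1]
    return np.linalg.norm(pred_vel - gt_vel, axis=-1).mean()

def frame_correlation(seq):
    """T x T cosine similarity between mean-centred, flattened frames.

    An illustrative stand-in for a temporal correlation; it relates every
    frame to every other frame, not just to its neighbour.
    """
    flat = seq.reshape(seq.shape[0], -1)
    flat = flat - flat.mean(axis=0, keepdims=True)
    norm = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norm, 1e-8)  # guard against zero-length rows
    return unit @ unit.T

def correlation_loss(pred, gt):
    """Mean absolute difference between predicted and target correlation matrices."""
    return np.abs(frame_correlation(pred) - frame_correlation(gt)).mean()
```

In this sketch the smoothness term only ever compares frame t with frame t+1, while the correlation term compares all frame pairs, which is one way to read the abstract's move from short-term to more comprehensive temporal properties.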