Implicit Decouple Network for Efficient Pose Estimation

MM '23: Proceedings of the 31st ACM International Conference on Multimedia (2023)

Abstract
In the field of pose estimation, keypoint representations can take the form of Gaussian heatmaps, classification vectors, or direct coordinates. However, current networks lack consistency with these keypoint representations: they only accommodate them in the final layer, resulting in suboptimal efficiency and requiring a large number of parameters or a high computational cost. In this paper, we propose a simple yet efficient plug-and-play module, named the Implicit Decouple Module (IDM), which decouples features into two parts along the x-y axes and aggregates them in a direction-aware manner. This approach implicitly fuses direction-specific coordinate information, improving consistency with the keypoint representations, especially in vector form. Furthermore, we introduce a fully convolutional backbone network, named the Implicit Decouple Network (IDN), which incorporates IDM without the need to maintain high-resolution features, perform dense multi-level feature fusion, or stack many repeated stages, while still achieving high performance. In experiments on the COCO dataset, our basic IDN without pre-training outperforms HRNet (28.5M) by 2.4 AP with 18.2M parameters, and even surpasses some transformer-based methods. In the lightweight model scenario, our model outstrips Lite-HRNet by 3.9 AP with only 2.5M parameters. We also evaluate our model on the person instance segmentation task and on other datasets, demonstrating its generality and effectiveness. http(s)://znk.ink/su/mm23idn.