Implicit Decouple Network for Efficient Pose Estimation
MM '23: Proceedings of the 31st ACM International Conference on Multimedia (2023)
Abstract
In the field of pose estimation, keypoint representations can take the form of Gaussian heatmaps, classification vectors, or direct coordinates. However, current networks lack consistency with these keypoint representations: they accommodate them only in the final layer, resulting in suboptimal efficiency and requiring many parameters or considerable computation. In this paper, we propose a simple yet efficient plug-and-play module, named the Implicit Decouple Module (IDM), which decouples features into two parts along the x- and y-axes and aggregates them in a direction-aware manner. This approach implicitly fuses direction-specific coordinate information, improving consistency with the keypoint representations, especially in vector form. Furthermore, we introduce a fully convolutional backbone network, named the Implicit Decouple Network (IDN), which incorporates IDM without the need to maintain high-resolution features, dense multi-level feature fusion, or many repeated stages, while still achieving high performance. In experiments on the COCO dataset, our basic IDN without pre-training outperforms HRNet (28.5M parameters) by 2.4 AP with only 18.2M parameters, and even surpasses some transformer-based methods. In the lightweight setting, our model outstrips Lite-HRNet by 3.9 AP with only 2.5M parameters. We also evaluate our model on the person instance segmentation task and on other datasets, demonstrating its generality and effectiveness. http(s)://znk.ink/su/mm23idn.
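The abstract describes IDM as decoupling a feature map into two parts along the x- and y-axes and aggregating them in a direction-aware way so that coordinate information is fused implicitly. A minimal PyTorch sketch of that idea is below; the class name, layer choices (per-axis average pooling, 1x1 convolutions, sigmoid gating), and fusion by broadcasting are all illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class ImplicitDecoupleModule(nn.Module):
    """Hypothetical sketch of direction-aware feature decoupling.

    Collapses each spatial axis separately, so each descriptor
    implicitly encodes position along one direction, then re-injects
    both via broadcasted gating.
    """

    def __init__(self, channels: int):
        super().__init__()
        # Direction-aware pooling: collapse one spatial axis at a time.
        self.pool_w = nn.AdaptiveAvgPool2d((None, 1))  # -> (B, C, H, 1)
        self.pool_h = nn.AdaptiveAvgPool2d((1, None))  # -> (B, C, 1, W)
        self.conv_h = nn.Conv2d(channels, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Per-direction descriptors (one per row, one per column).
        gate_h = torch.sigmoid(self.conv_h(self.pool_w(x)))  # (B, C, H, 1)
        gate_w = torch.sigmoid(self.conv_w(self.pool_h(x)))  # (B, C, 1, W)
        # Broadcasting fuses the direction-specific information back in,
        # preserving the input resolution.
        return x * gate_h * gate_w


feat = torch.randn(2, 32, 16, 12)
out = ImplicitDecoupleModule(32)(feat)
print(out.shape)  # torch.Size([2, 32, 16, 12])
```

Because the module preserves input shape, it can be dropped between stages of an ordinary convolutional backbone, which matches the paper's claim that IDN needs neither high-resolution branches nor dense multi-level fusion.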