Focus on Low-Resolution Information: Multi-Granular Information-Lossless Model for Low-Resolution Human Pose Estimation
arXiv (2024)
Abstract
In real-world applications of human pose estimation, low-resolution input
images are frequently encountered when the performance of the image acquisition
equipment is limited or the shooting distance is too far. However, existing
state-of-the-art models for human pose estimation perform poorly on
low-resolution images. One key reason is the presence of downsampling layers in
these models, e.g., strided convolutions and pooling layers, which further reduce
the already insufficient image information. Another key reason is that the body
skeleton and human kinematic information are not fully utilized. In this work,
we propose a Multi-Granular Information-Lossless (MGIL) model to replace the
downsampling layers to address the above issues. Specifically, MGIL employs a
Fine-grained Lossless Information Extraction (FLIE) module, which can prevent
the loss of local information. Furthermore, we design a Coarse-grained
Information Interaction (CII) module to adequately leverage human body
structural information. To efficiently fuse cross-granular information and
thoroughly exploit the relationships among keypoints, we further introduce a
Multi-Granular Adaptive Fusion (MGAF) mechanism. The mechanism assigns weights
to features of different granularities based on the content of the image. The
model is effective, flexible, and universal. We show its potential in various
vision tasks with comprehensive experiments. It outperforms state-of-the-art
methods by 7.7 mAP on COCO and performs well across different input resolutions,
backbones, and vision tasks. The code is provided in the supplementary
material.
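
The abstract's point about downsampling can be illustrated with a toy example. The sketch below is not the FLIE module and is not taken from the paper; it only contrasts plain stride-2 pixel subsampling, which keeps one value per 2x2 block, with a space-to-depth rearrangement, a well-known invertible way to halve spatial resolution. All function names here are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: these helpers are assumptions for demonstration,
# not code from the MGIL paper or its supplementary material.

def strided_subsample(image, stride=2):
    """Keep every `stride`-th pixel along both axes (simplest strided downsampling)."""
    return [row[::stride] for row in image[::stride]]

def space_to_depth(image, block=2):
    """Move each block x block patch into a list of block*block channel values."""
    height, width = len(image), len(image[0])
    assert height % block == 0 and width % block == 0
    cells = []
    for i in range(0, height, block):
        cell_row = []
        for j in range(0, width, block):
            cell_row.append(
                [image[i + di][j + dj] for di in range(block) for dj in range(block)]
            )
        cells.append(cell_row)
    return cells

def depth_to_space(cells, block=2):
    """Inverse of space_to_depth: scatter channel values back to pixel positions."""
    height, width = len(cells) * block, len(cells[0]) * block
    image = [[None] * width for _ in range(height)]
    for i, cell_row in enumerate(cells):
        for j, channels in enumerate(cell_row):
            for k, value in enumerate(channels):
                image[i * block + k // block][j * block + k % block] = value
    return image

# A 4x4 toy "image" whose pixel values are 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]

subsampled = strided_subsample(image)   # 2x2 grid, 4 of the 16 values survive
packed = space_to_depth(image)          # 2x2 grid, 4 channels each, all 16 values kept
restored = depth_to_space(packed)       # exact reconstruction of the input
```

Both outputs have half the spatial resolution of the input, but only the rearranged version can be mapped back to the original pixels; the subsampled version has discarded three quarters of them.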