Differentiable Multi-Granularity Human Parsing

IEEE Transactions on Pattern Analysis and Machine Intelligence (2023)

Abstract
In this work, we study the challenging problem of instance-aware human body part parsing. We introduce a new bottom-up regime which achieves the task through learning category-level human semantic segmentation as well as multi-person pose estimation in a joint and end-to-end manner. The result is a compact, efficient and powerful framework that exploits structural information over different human granularities and eases the difficulty of person partitioning. Specifically, a dense-to-sparse projection field, which allows explicitly associating dense human semantics with sparse keypoints, is learnt and progressively improved over the network feature pyramid for robustness. Then, the difficult pixel grouping problem is cast as an easier, multi-person joint assembling task. By formulating joint association as maximum-weight bipartite matching, we develop two novel algorithms based on projected gradient descent and unbalanced optimal transport, respectively, to solve the matching problem differentiably. These algorithms make our method end-to-end trainable and allow back-propagating the grouping error to directly supervise multi-granularity human representation learning. This differs markedly from current bottom-up human parsers or pose estimators, which require sophisticated post-processing or heuristic greedy algorithms. Extensive experiments on three instance-aware human parsing datasets (i.e., MHP-v2, DensePose-COCO, PASCAL-Person-Part) demonstrate that our approach outperforms most existing human parsers with much more efficient inference. Our code is available at https://github.com/tfzhou/MG-HumanParsing.
Keywords
Instance-aware human semantic parsing, multi-person pose estimation, multi-granularity human representation