M3Act: Learning from Synthetic Human Group Activities
arXiv (2023)
Abstract
The study of complex human interactions and group activities has become a
focal point in human-centric computer vision. However, progress in related
tasks is often hindered by the challenges of obtaining large-scale labeled
datasets from real-world scenarios. To address this limitation, we introduce
M3Act, a synthetic data generator for multi-view multi-group multi-person human
atomic actions and group activities. Powered by the Unity Engine, M3Act features
multiple semantic groups, highly diverse and photorealistic images, and a
comprehensive set of annotations, which facilitates the learning of
human-centered tasks across single-person, multi-person, and multi-group
conditions. We demonstrate the advantages of M3Act across three core
experiments. The results suggest our synthetic dataset can significantly
improve the performance of several downstream methods and replace real-world
datasets to reduce cost. Notably, M3Act improves the state-of-the-art MOTRv2 on
the DanceTrack dataset, lifting it from 10th to 2nd place on the leaderboard.
Moreover, M3Act opens new research for controllable 3D group activity
generation. We define multiple metrics and propose a competitive baseline for
the novel task. Our code and data are available at our project page:
http://cjerry1243.github.io/M3Act.