Person-level Action Recognition in Complex Events via TSD-TSM Networks

MM '20: The 28th ACM International Conference on Multimedia, Seattle, WA, USA, October 2020

Abstract
The task of person-level action recognition in complex events aims to densely detect pedestrians and individually predict their actions from surveillance videos. In this paper, we present a simple yet efficient pipeline for this task, referred to as TSD-TSM networks. First, we adopt the TSD detector for pedestrian localization on each keyframe. Second, we generate the sequential ROIs for a person proposal by replicating the adjusted bounding box coordinates across the frames around the keyframe. In particular, we propose to apply straddling expansion and region squaring to the original bounding box of a person proposal, which widens the potential space of motion and interaction and yields a square box for ROI detection. Finally, we adapt the TSM classifier to the generated ROI sequences to perform action classification, and further adopt late fusion to improve the prediction. Our proposed pipeline achieved 3rd place in the ACM-MM 2020 grand challenge, i.e., Large-scale Human-centric Video Analysis in Complex Events (Track-4), obtaining a final 15.31% wf-mAP@avg and 20.63% f-mAP@avg on the testing set.
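The ROI generation step described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption for illustration: the function names `expand_and_square` and `sequential_rois`, the `expand_ratio`, `clip_len`, and `stride` parameters and their default values, and the reading of "straddling expansion" as symmetric growth of the box on all sides. Boxes are assumed to be `(x1, y1, x2, y2)` in pixels.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def expand_and_square(box: Box, expand_ratio: float = 0.25) -> Box:
    """Enlarge a person box, then turn it into a square around the same centre.

    expand_ratio is a hypothetical value, not taken from the paper.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    # Straddling expansion (assumed form): grow each side by expand_ratio.
    w = w * (1.0 + 2.0 * expand_ratio)
    h = h * (1.0 + 2.0 * expand_ratio)
    # Region squaring: use the longer side so the whole person stays inside.
    half = max(w, h) / 2.0
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    return (cx - half, cy - half, cx + half, cy + half)


def sequential_rois(
    box: Box,
    key_idx: int,
    num_frames: int,
    clip_len: int = 8,
    stride: int = 1,
) -> List[Tuple[int, Box]]:
    """Replicate the adjusted keyframe box over frames around the keyframe.

    Frame indices that fall outside the video are clamped to its ends.
    """
    roi = expand_and_square(box)
    start = key_idx - (clip_len // 2) * stride
    rois = []
    for i in range(clip_len):
        idx = min(max(start + i * stride, 0), num_frames - 1)
        rois.append((idx, roi))
    return rois
```

Because the same square box is reused for every frame in the clip, no tracking is required; the expansion is what leaves room for the person to move within the fixed region. Clipping the square to the image bounds is deliberately left out of this sketch, since clipping can break the square shape and the abstract does not say how that case is handled.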
Keywords
Human action recognition, pedestrian detection, complex events