3D Deformable Convolution Temporal Reasoning network for action recognition
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION(2023)
摘要
Modeling and reasoning of the interactions between multiple entities (actors and objects) are beneficial for the action recognition task. In this paper, we propose a 3D Deformable Convolution Temporal Reasoning (DCTR) network to model and reason about the latent relationship dependencies between different entities in videos. The proposed DCTR network consists of a spatial modeling module and a temporal reasoning module. The spatial modeling module uses 3D deformable convolution to capture relationship dependencies between different entities in the same frame, while the temporal reasoning module uses Conv-LSTM to reason about the changes of multiple entity relationship dependencies in the temporal dimension. Experiments on the Moments-in-Time dataset, UCF101 dataset and HMDB51 dataset demonstrate that the proposed method outperforms several state-of-the-art methods.
更多查看译文
关键词
Action recognition,3D deformable convolutional network,Reasoning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要