Cross Fusion For Egocentric Interactive Action Recognition

MultiMedia Modeling (MMM 2020), Part I (2020)

Abstract
The characteristics of egocentric interactive videos, including heavy ego-motion, frequent viewpoint changes, and multiple types of activities, prevent action recognition methods designed for third-person vision from obtaining satisfactory results. In this paper, we introduce an effective architecture with two branches and a cross fusion method for action recognition in egocentric interactive vision. The two branches model the information from observers and inter-actors respectively, and each branch is built on multimodal multi-stream C3D networks. We leverage cross fusion to establish effective linkages between the two branches, aiming to reduce redundant information and fuse complementary features. In addition, we propose variable sampling to obtain discriminative snippets for training. Experimental results demonstrate that the proposed architecture achieves superior performance over several state-of-the-art methods on two benchmarks.
Keywords
Egocentric interactive videos, Action recognition, Cross fusion