Modeling Cross-View Interaction Consistency for Paired Egocentric Interaction Recognition
2020 IEEE International Conference on Multimedia and Expo (ICME), 2020
Abstract
With the development of Augmented Reality (AR), egocentric action recognition (EAR) plays an important role in accurately understanding user demands. However, EAR is designed to recognize human-machine interaction in a single egocentric view, making it difficult to capture interactions between two face-to-face AR users. Paired egocentric interaction recognition (PEIR) is the task of collaboratively recognizing the interactions between two persons from the videos captured in their respective views. Unfortunately, existing PEIR methods typically use a linear decision function to directly fuse the features extracted from the two corresponding egocentric videos, which ignores the consistency of the interaction across the paired videos. The interactions in paired videos, and the features extracted from them, are correlated with each other. Motivated by this, we propose to derive the relevance between the two views using bilinear pooling, which captures the consistency of the two views at the feature level. Specifically, each neuron in the feature maps from one view connects to the neurons from the other view, which enforces compact consistency between the two views, and all possible neuron pairs are then used for PEIR. For efficiency, we use compact bilinear pooling with Count Sketch to avoid computing the outer product directly. Experimental results on the PEV dataset show the superiority of the proposed method on the PEIR task.
Keywords
Paired egocentric interaction recognition, bilinear pooling, action recognition
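The abstract describes compact bilinear pooling with Count Sketch (the Tensor Sketch trick): instead of forming the full outer product of two feature vectors, each vector is sketched into a low-dimensional space, and the sketch of the outer product is obtained as a circular convolution of the two sketches, computed via FFT. Below is a minimal NumPy sketch of this idea; the function names, dimensions, and random hash construction are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def count_sketch(x, h, s, d):
    # Count Sketch projection: y[h[i]] += s[i] * x[i],
    # where h maps each input index to one of d buckets
    # and s holds random +/-1 signs.
    y = np.zeros(d)
    np.add.at(y, h, s * x)
    return y

def compact_bilinear(x, y, d, seed=0):
    # Compact bilinear pooling of feature vectors x and y
    # (e.g., one per egocentric view). The count sketch of the
    # outer product x y^T equals the circular convolution of the
    # individual sketches, computed here in the Fourier domain.
    rng = np.random.default_rng(seed)  # fixed hashes per model, an assumption
    hx = rng.integers(0, d, x.size)
    sx = rng.choice([-1.0, 1.0], x.size)
    hy = rng.integers(0, d, y.size)
    sy = rng.choice([-1.0, 1.0], y.size)
    fx = np.fft.rfft(count_sketch(x, hx, sx, d))
    fy = np.fft.rfft(count_sketch(y, hy, sy, d))
    return np.fft.irfft(fx * fy, n=d)  # d-dim pooled feature
```

This costs O(n + d log d) per pair of vectors rather than the O(n^2) of an explicit outer product, while approximately preserving the inner-product structure of the bilinear feature.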