Recognising complex activities with histograms of relative tracklets.

Computer Vision and Image Understanding (2017)

Abstract
We propose a method for activity recognition from video and accelerometer data. Visual accelerometer localization and tracking establishes cross-modal relations. RETLETS encode relative motion between tracked objects and local visual features. Recognition using various feature combinations is evaluated on the 50 Salads dataset. The method using RETLETS outperforms the state of the art on this dataset.

One approach to the recognition of complex human activities is to use feature descriptors that encode visual interactions by describing properties of local visual features with respect to trajectories of tracked objects. We explore an example of such an approach in which dense tracklets are described relative to multiple reference trajectories, providing a rich representation of complex interactions between objects of which only a subset can be tracked. Specifically, we report experiments in which reference trajectories are provided by tracking inertial sensors in a food preparation scenario. Additionally, we provide baseline results for HOG, HOF and MBH, and combine these features with others for multi-modal recognition. The proposed histograms of relative tracklets (RETLETS) showed better activity recognition performance than dense tracklets, HOG, HOF, MBH, or their combination. Our comparative evaluation of features from accelerometers and video highlighted a performance gap between visual and accelerometer-based motion features and showed a substantial performance gain when combining features from these sensor modalities. A considerable further performance gain was observed in combination with RETLETS and reference tracklet features.
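The idea of describing a tracklet relative to several reference trajectories can be illustrated with a minimal sketch. This is not the paper's descriptor: the abstract does not specify binning, normalisation, or how tracklets and references are aligned, so the direction-histogram encoding, the 8-bin default, and all function names below (`relative_displacements`, `direction_histogram`, `relative_tracklet_descriptor`) are assumptions made purely for illustration.

```python
import math


def relative_displacements(tracklet, reference):
    """Frame-to-frame motion of a tracklet, measured relative to a reference trajectory.

    Both inputs are lists of (x, y) points covering the same frames (an assumption).
    """
    if len(tracklet) != len(reference):
        raise ValueError("tracklet and reference must cover the same frames")
    # Express each tracklet point in a coordinate frame centred on the reference point.
    rel = [(tx - rx, ty - ry) for (tx, ty), (rx, ry) in zip(tracklet, reference)]
    # Differences of consecutive relative positions give the relative motion.
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(rel, rel[1:])]


def direction_histogram(displacements, n_bins=8, min_magnitude=1e-6):
    """Magnitude-weighted histogram of displacement directions, L1-normalised."""
    hist = [0.0] * n_bins
    for dx, dy in displacements:
        magnitude = math.hypot(dx, dy)
        if magnitude < min_magnitude:
            continue  # no relative motion in this step
        angle = math.atan2(dy, dx) % (2 * math.pi)
        bin_index = min(int(angle / (2 * math.pi) * n_bins), n_bins - 1)
        hist[bin_index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def relative_tracklet_descriptor(tracklet, references, n_bins=8):
    """Concatenate one direction histogram per reference trajectory."""
    descriptor = []
    for reference in references:
        steps = relative_displacements(tracklet, reference)
        descriptor.extend(direction_histogram(steps, n_bins))
    return descriptor


# Hypothetical data: a point moving right, one stationary reference,
# and one reference moving in lockstep with the point.
tracklet = [(0, 0), (1, 0), (2, 0)]
references = [[(5, 5)] * 3, [(0, 1), (1, 1), (2, 1)]]
descriptor = relative_tracklet_descriptor(tracklet, references)
```

In this toy case the first half of `descriptor` concentrates all its mass in the rightward bin, while the second half is all zeros because the point does not move relative to the second reference. That contrast is the kind of interaction cue a relative encoding is meant to capture.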
Keywords
Activity recognition, Relative tracklets, Sensor fusion, Food preparation