Graph Distillation for Action Detection with Privileged Information.
arXiv: Computer Vision and Pattern Recognition (2017)
Abstract
In this work, we propose a technique that tackles the video understanding problem under a realistic, demanding condition in which we have limited labeled data and partially observed training modalities. Common methods such as transfer learning do not take advantage of the rich information from extra modalities potentially available in the source-domain dataset. On the other hand, previous work on cross-modality learning only focuses on a single domain or task. In this work, we propose a graph-based distillation method that incorporates rich privileged information from a large multi-modal dataset in the source domain, and shows improved performance in the target domain where data is scarce. Leveraging both a large-scale dataset and its extra modalities, our method learns a better model for temporal action detection and action classification without needing access to these modalities at test time. We evaluate our approach on action classification and temporal action detection tasks, and show that our models achieve state-of-the-art performance on the PKU-MMD and NTU RGB+D datasets.
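The core idea of distilling from multiple privileged modalities can be illustrated with a toy sketch. This is not the paper's actual formulation: the function names, the fixed edge weights, and the simple weighted-mixture soft target are all illustrative assumptions; the paper learns the graph over modalities rather than fixing it.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a 1-D array of logits."""
    z = np.asarray(logits, dtype=float) / temperature
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def graph_distillation_loss(student_logits, teacher_logits_by_modality,
                            edge_weights, temperature=2.0):
    """Toy sketch (hypothetical): the soft target is a graph-weighted
    mixture of per-modality teacher predictions; the loss is the KL
    divergence from the student's softened prediction to that target."""
    w = np.asarray(edge_weights, dtype=float)
    w = w / w.sum()  # normalize edge weights into a distribution
    soft_target = sum(wi * softmax(t, temperature)
                      for wi, t in zip(w, teacher_logits_by_modality))
    p = softmax(student_logits, temperature)
    # KL(soft_target || p), with a small epsilon for numerical safety
    eps = 1e-12
    return float(np.sum(soft_target * (np.log(soft_target + eps)
                                       - np.log(p + eps))))
```

Under this sketch, a student trained on the available modality (e.g. RGB) can absorb supervision from teachers trained on privileged modalities (e.g. depth or skeleton) during training, while inference needs only the student.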