Dense Dilated Network for Few Shot Action Recognition.

ICMR '18: International Conference on Multimedia Retrieval, Yokohama, Japan, June 2018

Abstract
Video action recognition has been widely studied in recent years, but training deep neural networks requires a large number of well-labeled videos. At the same time, videos in the same class share high-level semantic similarity. In this paper, we introduce a novel neural network architecture that simultaneously captures local and long-term spatio-temporal information. We propose a dilated dense network whose blocks are composed of densely connected dilated convolution layers. The framework fuses each layer's outputs to learn high-level representations, and these representations remain robust even with only a few training snippets. We also explore aggregations of dilated dense blocks. Extensive experiments on UCF101 demonstrate the effectiveness of the proposed method, especially with few training examples.
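The idea of a block built from densely connected dilated convolutions can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single 1D temporal axis instead of video feature maps, fixed all-ones kernels instead of learned weights, and dilation rates 1, 2, 4 chosen purely for illustration. The names `dilated_conv1d`, `dense_dilated_block`, and `receptive_field` are hypothetical.

```python
# Illustrative sketch only: a densely connected block of dilated 1D convolutions.
# Assumptions (not from the paper): 1D temporal input, fixed all-ones kernels,
# ReLU activations, and dilation rates (1, 2, 4).

def dilated_conv1d(x, kernel, dilation):
    """Zero-padded, same-length 1D dilated convolution over a list of floats."""
    center = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - center) * dilation
            if 0 <= idx < len(x):  # positions outside the sequence count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def dense_dilated_block(x, dilations, kernel=(1.0, 1.0, 1.0)):
    """Each layer reads ALL earlier feature channels (dense connectivity),
    applies a dilated convolution to each, sums them, and applies ReLU.
    The block returns every channel so later stages can fuse all outputs."""
    features = [list(x)]
    for d in dilations:
        summed = [0.0] * len(x)
        for channel in features:
            y = dilated_conv1d(channel, kernel, d)
            summed = [a + b for a, b in zip(summed, y)]
        features.append([max(0.0, v) for v in summed])  # ReLU
    return features


def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of dilated convolutions with stride 1."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Usage: feed a unit impulse and inspect how far the last layer "sees".
impulse = [0.0] * 8 + [1.0] + [0.0] * 8
features = dense_dilated_block(impulse, dilations=[1, 2, 4])
span = sum(1 for v in features[-1] if v > 0)
print(len(features), span, receptive_field(3, [1, 2, 4]))
```

In this toy setting the early channels keep local context while the last channel covers a much wider window, which is one way to read the abstract's claim about capturing local and long-term information together.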
Keywords
Action recognition, dilated convolution, few-shot learning, video analysis