Motion Fused Frames: Data Level Fusion Strategy for Hand Gesture Recognition

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018

Abstract
Acquiring spatio-temporal states of an action is the most crucial step for action classification. In this paper, we propose a data level fusion strategy, Motion Fused Frames (MFFs), designed to fuse motion information into static images as better representatives of spatio-temporal states of an action. MFFs can be used as input to any deep learning architecture with very little modification to the network. We evaluate MFFs on hand gesture recognition tasks using three video datasets (Jester, ChaLearn LAP IsoGD and NVIDIA Dynamic Hand Gesture), which require capturing long-term temporal relations of hand movements. Our approach obtains very competitive performance on the Jester and ChaLearn benchmarks, with classification accuracies of 96.28% and 57.4%, respectively, and achieves state-of-the-art performance on the NVIDIA benchmark with 84.7% accuracy.
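
The fusion idea described above can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that "data level fusion" means stacking motion maps onto a static frame as extra channels, and it substitutes a plain frame difference for real optical flow. The function names and the nested-list frame layout are hypothetical, chosen only to keep the example dependency-free; a practical pipeline would more likely use an array library.

```python
# Hypothetical sketch of data level fusion, not the authors' implementation.
# Frames are nested lists shaped [height][width][channels].


def frame_difference(prev_gray, next_gray):
    """Crude stand-in for optical flow: a one-channel temporal difference.

    Both inputs are grayscale frames shaped [height][width].
    """
    return [
        [[after - before] for before, after in zip(row_prev, row_next)]
        for row_prev, row_next in zip(prev_gray, next_gray)
    ]


def fuse_channels(rgb_frame, motion_maps):
    """Append the channels of every motion map to each pixel of the RGB frame."""
    fused = []
    for y, row in enumerate(rgb_frame):
        fused_row = []
        for x, pixel in enumerate(row):
            channels = list(pixel)  # copy the static colour channels
            for motion in motion_maps:
                channels.extend(motion[y][x])  # add motion channels
            fused_row.append(channels)
        fused.append(fused_row)
    return fused


if __name__ == "__main__":
    # Two tiny 2x2 grayscale frames and the RGB frame that follows them.
    prev_gray = [[0, 1], [2, 3]]
    next_gray = [[1, 1], [2, 5]]
    rgb = [
        [[10, 20, 30], [11, 21, 31]],
        [[12, 22, 32], [13, 23, 33]],
    ]
    motion = frame_difference(prev_gray, next_gray)
    fused = fuse_channels(rgb, [motion])
    print(len(fused[0][0]))  # channels per pixel after fusion
```

Under this assumption, the fused frame has more channels than a plain RGB image, so the main change a network would need is a first layer that accepts the larger channel count. That is one plausible reading of the "very little modification" mentioned in the abstract.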
Keywords
Motion Fused Frames, data level fusion strategy, action classification, motion information, deep learning architecture, hand gesture recognition tasks, NVIDIA Dynamic Hand Gesture Datasets, hand movements, MFF, video datasets, long-term temporal relations, spatiotemporal states, ChaLearn LAP IsoGD, Jester