Multi-modal fusion for robust hand gesture recognition based on heterogeneous networks

SCIENCE CHINA Technological Sciences (2023)

Abstract
Hand gesture recognition has become a vital subject in human-computer interaction and rehabilitation assessment. This paper presents a multi-modal fusion for hand gesture recognition (MFHG) model, which uses two heterogeneous networks to extract and fuse features from vision-based motion signals and surface electromyography (sEMG) signals, respectively. To extract features from the vision-based motion signals, a graph neural network, named the cumulation graph attention (CGAT) model, is first proposed to characterize prior knowledge of the motion coupling between finger joints. The CGAT model uses a cumulation mechanism to combine early- and late-stage extracted features and thereby improve motion-based hand gesture recognition. For the sEMG signals, a time-frequency convolutional neural network, named TF-CNN, is proposed to extract both the time-domain and frequency-domain information of the signals. To improve recognition performance, the deep features from the two modalities are merged by an average layer, and regularization terms comprising a center loss and a mutual information loss are employed to enhance the robustness of the multi-modal system. Finally, a data set containing multi-modal signals recorded from seven subjects on different days is built to verify the model's performance. The experimental results indicate that MFHG reaches 99.96% and 92.46% accuracy on hand gesture recognition in the within-session and cross-day cases, respectively.
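The fusion step described above, averaging the per-modality deep features and regularizing with a center loss, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the feature shapes, the random class centers, and the variable names (`feat_motion`, `feat_semg`) are assumptions for demonstration, and in the actual model the features and centers would be learned end to end.

```python
import numpy as np

# Illustrative sketch (not the paper's code): fuse two modality feature
# vectors with an element-wise average layer and compute a center-loss
# regularizer that pulls each fused feature toward its class center.
rng = np.random.default_rng(0)

batch, dim, n_classes = 4, 8, 3
feat_motion = rng.normal(size=(batch, dim))  # assumed CGAT output shape
feat_semg = rng.normal(size=(batch, dim))    # assumed TF-CNN output shape
labels = np.array([0, 1, 2, 0])

# Average layer: element-wise mean of the two modality features.
fused = (feat_motion + feat_semg) / 2.0

# Center loss: mean squared distance of each fused feature to the
# center of its class (centers here are random stand-ins; in training
# they would be learned jointly with the network).
centers = rng.normal(size=(n_classes, dim))
center_loss = 0.5 * np.mean(np.sum((fused - centers[labels]) ** 2, axis=1))

print(fused.shape)  # (4, 8)
```

The mutual information loss mentioned in the abstract would be an additional regularization term on the same fused features; it is omitted here because its exact form is not specified in the abstract.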
Keywords
robust hand gesture recognition, heterogeneous networks, fusion, multi-modal