Sign language recognition via dimensional global–local shift and cross-scale aggregation

NEURAL COMPUTING & APPLICATIONS (2023)

Abstract
Sign languages generally consist of sequences of upper-body gestures produced cooperatively by several parts, such as the hands, arms, and face. Robust recognition therefore depends on the dynamics of these parts as well as on the holistic appearance of the upper body and of each individual part. In this paper, a global–local representation (GLR) module is proposed to strengthen spatiotemporal feature modeling. The GLR module is composed of a global shift and a local shift along the height, width, and temporal dimensions. Specifically, the global shift is applied to the entire feature map to obtain a holistic representation, while the local shift is restricted to local patches to capture detailed features. Furthermore, a novel cross-scale aggregation module is designed to combine the global and local information across the different dimensions. Extensive experiments on three large-scale benchmarks (WLASL, INCLUDE, and LSA64) demonstrate that the proposed method achieves state-of-the-art recognition performance.
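To make the distinction between the two shifts concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the abstract does not give the shift rule, so the sketch assumes a zero-padded, one-step channel shift (a fraction of channels moves forward, an equal fraction moves backward, the rest stay put), a channels-last layout `(T, H, W, C)`, and non-overlapping patches. The names `global_shift`, `local_shift`, `fold_div`, and `patch` are hypothetical.

```python
import numpy as np

def global_shift(x, axis, fold_div=4):
    """Shift part of the channels by one step along `axis` over the whole map.

    Assumes x has shape (T, H, W, C) and `axis` is 0, 1, or 2 (never the
    channel axis). Vacated positions are zero-filled.
    """
    fold = x.shape[-1] // fold_div
    out = np.zeros_like(x)
    src = np.moveaxis(x, axis, 0)    # view with the shifted axis first
    dst = np.moveaxis(out, axis, 0)  # view into `out`, so writes land in `out`
    dst[1:, ..., :fold] = src[:-1, ..., :fold]                  # forward shift
    dst[:-1, ..., fold:2 * fold] = src[1:, ..., fold:2 * fold]  # backward shift
    dst[..., 2 * fold:] = src[..., 2 * fold:]                   # unchanged channels
    return out

def local_shift(x, axis, patch, fold_div=4):
    """Apply the same shift independently inside each patch along `axis`.

    Assumption: patches are non-overlapping and tile the axis exactly, so no
    information crosses a patch boundary.
    """
    n = x.shape[axis]
    if n % patch != 0:
        raise ValueError("axis length must be a multiple of the patch size")
    out = np.empty_like(x)
    for start in range(0, n, patch):
        idx = [slice(None)] * x.ndim
        idx[axis] = slice(start, start + patch)
        out[tuple(idx)] = global_shift(x[tuple(idx)], axis, fold_div)
    return out

# Example: 8 frames, a 16x16 feature map, 32 channels.
features = np.random.rand(8, 16, 16, 32)
temporal_global = global_shift(features, axis=0)         # whole clip
height_local = local_shift(features, axis=1, patch=4)    # 4-row patches
```

Under these assumptions, the global shift lets a position exchange information with its neighbour anywhere along the chosen dimension, whereas the local shift zero-fills at every patch border, which keeps the mixing confined to a small region.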
Keywords
Sign language recognition, Global-local representation, Shift operation, Cross-scale aggregation