Dynamic Korean Sign Language Recognition Using Pose Estimation Based and Attention-Based Neural Network

Jungpil Shin, Abu Saleh Musa Miah, Kota Suzuki, Koki Hirooka, Md. Al Mehedi Hasan

IEEE Access (2023)

Abstract

Sign language recognition is crucial for improving communication accessibility for the hearing-impaired community and reducing dependence on human interpreters. While significant research effort has been devoted to many prevalent languages, Korean Sign Language (KSL) remains relatively underexplored, particularly with respect to dynamic signs and generalizability. The scarcity of KSL datasets has exacerbated this limitation. Furthermore, most KSL research relies on static image-based datasets for recognition, which lowers accuracy and makes it impossible to detect dynamic sign words. Existing KSL recognition systems also suffer from suboptimal accuracy and high computational complexity. To address these challenges, we propose a dynamic KSL recognition system that combines a skeleton-based graph convolutional network (GCN) with an attention-based neural network. Our solution employs a two-stream deep learning network to handle dynamic signs, improving accuracy by effectively handling non-connected joint skeleton features. The first stream processes 47 pose landmarks using the GCN to extract graph-based features. These features are refined through a channel attention module and a general CNN, enhancing their temporal context. Concurrently, the second stream focuses on joint motion-based features, employing a similar approach. The features from both streams are then integrated and passed through a classification module to recognize the sign word.
A significant contribution of our work is a novel KSL video dataset, which addresses the scarcity of data in this domain. The dataset includes skeletal data from 47 joint skeleton points and details from both hands, the body, and facial expressions. It aims to fill a critical gap in KSL research and provide a solid foundation for more extensive and inclusive studies in the field. Through this approach, we aim to advance dynamic sign recognition and improve the accessibility of sign language communication within the Korean hearing-impaired community and beyond. Our evaluations on the benchmark KSL-77 dataset and our proprietary lab dataset yielded recognition accuracies of 99.87% and 100%, respectively. These results show that our model outperforms existing KSL recognition models in both accuracy and computational efficiency.
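
The two-stream pipeline described in the abstract (graph convolution over 47 landmarks, channel attention, a temporal CNN, a parallel joint-motion stream, then fusion and classification) can be illustrated with a rough NumPy forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the chain-shaped skeleton graph, the 2-D coordinates, the layer widths, the squeeze-and-excitation style attention, the frame-difference definition of joint motion, the 77-class output (inferred from the dataset name), and all function names are illustrative choices, and the weights are random.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalized_adjacency(num_joints, edges):
    """Symmetric-normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    a = np.eye(num_joints)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def joint_motion(x):
    """Frame-to-frame joint displacement (assumed definition); frame 0 gets zeros."""
    m = np.zeros_like(x)
    m[1:] = x[1:] - x[:-1]
    return m

def graph_conv(x, a_hat, w):
    """x: (T, V, C_in) -> (T, V, C_out). Aggregate neighbors, project, ReLU."""
    return np.maximum(np.einsum("uv,tvc,cd->tud", a_hat, x, w), 0.0)

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style gate over channels (assumed attention form)."""
    s = x.mean(axis=(0, 1))                 # squeeze over time and joints -> (C,)
    h = np.maximum(s @ w1, 0.0)             # bottleneck -> (C // r,)
    g = 1.0 / (1.0 + np.exp(-(h @ w2)))     # sigmoid gate -> (C,)
    return x * g

def temporal_conv(x, w):
    """1-D convolution along time with 'same' zero padding. w: (K, C_in, C_out)."""
    k, t = w.shape[0], x.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0), (0, 0)))
    out = sum(np.einsum("tvc,cd->tvd", xp[i:i + t], w[i]) for i in range(k))
    return np.maximum(out, 0.0)

def init_stream(c_in, c, r=4, k=3):
    """Random weights for one stream (hypothetical sizes)."""
    return {
        "gcn": rng.normal(0, 0.1, (c_in, c)),
        "att1": rng.normal(0, 0.1, (c, c // r)),
        "att2": rng.normal(0, 0.1, (c // r, c)),
        "tcn": rng.normal(0, 0.1, (k, c, c)),
    }

def stream(x, a_hat, p):
    """GCN -> channel attention -> temporal CNN -> global average pool."""
    h = graph_conv(x, a_hat, p["gcn"])
    h = channel_attention(h, p["att1"], p["att2"])
    h = temporal_conv(h, p["tcn"])
    return h.mean(axis=(0, 1))              # (C,)

def classify(x, a_hat, p_joint, p_motion, w_cls):
    """Concatenate both stream features and apply a softmax classifier."""
    f = np.concatenate([stream(x, a_hat, p_joint),
                        stream(joint_motion(x), a_hat, p_motion)])
    logits = f @ w_cls
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Assumed sizes: 30 frames, 47 joints, (x, y) coordinates, 77 sign classes.
T, V, C_IN, C, NUM_CLASSES = 30, 47, 2, 16, 77
edges = [(i, i + 1) for i in range(V - 1)]  # placeholder chain, not the real skeleton
a_hat = normalized_adjacency(V, edges)
x = rng.normal(size=(T, V, C_IN))           # stand-in for extracted pose landmarks
probs = classify(x, a_hat, init_stream(C_IN, C), init_stream(C_IN, C),
                 rng.normal(0, 0.1, (2 * C, NUM_CLASSES)))
print(probs.shape, probs.argmax())
```

With untrained weights the predicted class is meaningless; the point is the data flow, in which both streams share the same graph but see different inputs (raw joint positions versus their frame-to-frame differences) before their pooled features are concatenated.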

Keywords

Assistive technologies, Gesture recognition, Hidden Markov models, Deep learning, Support vector machines, Computational modeling, Skeleton, Sign language, Graph neural networks, Convolutional neural networks, Machine learning, Dynamic hand gesture recognition, Korean sign language (KSL), Graph convolutional network (GCN), General convolutional neural network (GCNN), Hand skeleton points
