A Novel Effective Combinatorial Framework for Sign Language Translation

2023 2nd International Conference on Big Data, Information and Computer Network (BDICN)(2023)

Sign language is a means by which people with speech and hearing disabilities communicate with the world. It uses body movements to represent syllables and form the corresponding words to convey information. The introduction of the CLIP model, which makes multi-modal tasks possible, offers a new solution for sign language recognition and translation. This paper proposes a novel framework for sign language translation, CL-MarianMT, which effectively combines a sign language recognition model with a state-of-the-art translation model. In the recognition stage, video feature vectors are extracted by the Video Encoder of the CLIP4clip architecture and fed into a Transformer model. The recognized sentences are then passed to the fine-tuned translation model MarianMT, which translates them from English to Chinese. Experiments show that fine-tuning MarianMT improves the quality of the Chinese translation of sign language. This work can help deaf-mute and hearing-impaired people communicate across languages, including with people who use different language systems, which gives it a certain social value.
CLIP4clip, MarianMT, Multimodal, Multitask, Sign Language Translation
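The two-stage design described in the abstract (recognize an English sentence from video, then translate it to Chinese) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `SignTranslationPipeline` and `load_marian_translator` are hypothetical names, the recognition stage (CLIP4clip Video Encoder plus Transformer) is left as a pluggable callable, and the public `Helsinki-NLP/opus-mt-en-zh` checkpoint stands in for the paper's fine-tuned MarianMT, which is not specified here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Placeholder type for one decoded video frame (assumption for illustration).
Frame = Sequence[float]


@dataclass
class SignTranslationPipeline:
    """Hypothetical wrapper chaining recognition and translation."""

    # Stage 1: maps video frames to a recognized English sentence.
    recognize: Callable[[Sequence[Frame]], str]
    # Stage 2: maps a batch of English sentences to Chinese sentences.
    translate: Callable[[List[str]], List[str]]

    def __call__(self, frames: Sequence[Frame]):
        english = self.recognize(frames)
        chinese = self.translate([english])[0]
        return english, chinese


def load_marian_translator(checkpoint: str = "Helsinki-NLP/opus-mt-en-zh"):
    """Build an English-to-Chinese translator with Hugging Face transformers.

    The default checkpoint is an assumption; the paper uses its own
    fine-tuned MarianMT weights.
    """
    # Imported lazily so the pipeline class works without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(checkpoint)
    model = MarianMTModel.from_pretrained(checkpoint)

    def translate(sentences: List[str]) -> List[str]:
        batch = tokenizer(sentences, return_tensors="pt", padding=True)
        generated = model.generate(**batch)
        return tokenizer.batch_decode(generated, skip_special_tokens=True)

    return translate
```

With a trained recognizer available, the pipeline would be built as `SignTranslationPipeline(recognize=my_recognizer, translate=load_marian_translator())`, where `my_recognizer` is whatever model produces the English sentence. Keeping the two stages as separate callables mirrors the combinatorial idea in the abstract: either stage can be swapped or fine-tuned independently.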