Heterogeneous attention based transformer for sign language translation
Applied Soft Computing (2023)
Abstract
Sign language translation (SLT) has attracted significant interest from both research and industry, enabling convenient communication with the deaf-mute community. While recent transformer-based models have improved sign translation performance, it remains under-explored how to design an efficient transformer-based deep network architecture that effectively extracts joint visual-text features by exploiting multi-level spatial and temporal contextual information. In this paper, we propose the heterogeneous attention based transformer (HAT), a novel SLT model that generates attention from diverse spatial and temporal contextual levels. Specifically, the proposed lightweight dual-stream sparse attention-based module yields more effective visual-text representations than conventional transformers. Extensive experiments demonstrate that HAT achieves state-of-the-art performance on the challenging PHOENIX2014T benchmark, with a BLEU-4 score of 25.33 on the test set.

(c) 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Keywords
Sign language translation, Transformer, Attention-based models, Dual-sparse attention, PHOENIX2014T
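The abstract does not detail the dual-stream sparse attention module, but the core idea of sparse attention can be illustrated with a minimal sketch. The snippet below is a hypothetical, pure-Python illustration of top-k sparse attention (keeping only the k strongest query-key matches and masking the rest), not the authors' actual HAT implementation; the function name and parameters are assumptions for illustration only.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def topk_sparse_attention(q, keys, values, k=2):
    """Single-query top-k sparse attention (illustrative, not the paper's module):
    score the query against every key, keep only the k largest scores,
    then softmax-weight the corresponding values."""
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d)
              for key in keys]
    # Indices of the k highest-scoring keys; all others are masked out.
    kept = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in kept])
    dim = len(values[0])
    out = [0.0] * dim
    for w, i in zip(weights, kept):
        for j in range(dim):
            out[j] += w * values[i][j]
    return out
```

In a dual-stream design, one such attention stream could operate over visual (video-frame) features and another over text (gloss/word) features, with the sparsity reducing computation and suppressing weakly related positions; how HAT combines the two streams is specified in the paper itself, not here.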