Modeling Word Importance in Conversational Transcripts: Toward improved live captioning for Deaf and hard of hearing viewers

Akhter Al Amin, Saad Hassan, Matt Huenerfauth, Cecilia Ovesdotter Alm

W4A (2023)

Abstract

Despite the recent improvements in automatic speech recognition (ASR) systems, their accuracy is imperfect in live conversational settings. Classifying the importance of each word in a caption transcription can enable evaluation metrics that best reflect Deaf and Hard of Hearing (DHH) readers' judgment of the caption quality. Prior work has proposed using word embeddings, e.g., word2vec or BERT embeddings, to model word importance in conversational transcripts. Recent work also disseminated a human-annotated word importance dataset. We conducted a word-token level analysis on this dataset and explored Part-of-Speech (POS) distribution. We then augmented the dataset with POS tags and reduced the class imbalance by generating 5% additional text using masking. Finally, we investigated how various supervised models learn the importance of words. The best performing model trained on our augmented dataset performed better than prior models. Our findings can inform the design of a metric for measuring live caption quality from DHH users' perspectives.

Keywords

Accessibility, Word Importance, Augmentation
