Automatic Annotation Of Voice Forum Content For Rural Users And Evaluation Of Relevance

Proceedings of the 1st ACM SIGCAS Conference on Computing and Sustainable Societies (COMPASS 2018), 2018

Abstract
Voice forums are an effective intervention medium that lets marginalized communities access information in a structured and localized manner. Users actively contribute by posting questions and responses as audio messages, thereby enriching the voice forum content. Building an audio library from voice forums to disseminate information, however, requires significant manual effort to analyze and curate the data. This is one of the key impediments to the successful use of voice forums for knowledge dissemination and training. In this paper, we explore the effectiveness of automated approaches to analyzing and curating voice forum content in Hindi, a language native to northern India. We study the use of standard techniques, namely topic modeling and extractive summarization, on Hindi speech transcripts (with a word error rate, WER, of 67%) to cluster audio recordings thematically and to create summaries of individual recordings, respectively. The curated recordings are used to build an IVR-based library for community health workers in rural India. We evaluated the relevance of, and preference for, the automated annotations in a field trial. We find that perceived relevance differed between human-generated and automatically generated annotations, but the automatically generated summaries were still found to be useful for accessing the voice forum audio.
Keywords
HCI4D, ICT4D, Interactive Voice Response, IVR, Community Health Workers, Topic Modeling, Speech Summarization