Omissions and inferential meaning-making in audio description, and implications for automating video content description

Universal Access in the Information Society (2023)

Abstract
There is broad consensus that audio description (AD) is a modality of intersemiotic translation, but there are different views on how AD can be more precisely conceptualised. While Benecke (Audiodeskription als partielle Translation. Modell und Methode, LIT, Berlin, 2014) characterises AD as ‘partial translation’, Braun (T 28: 302–313, 2016) hypothesises that what audio describers appear to ‘omit’ from their descriptions can normally be inferred by the audience, drawing on narrative cues from dialogue, mise-en-scène, kinesis, music or sound effects. The study reported in this paper tested this hypothesis using a corpus of material created during the H2020 MeMAD project. The MeMAD project aimed to improve access to audiovisual (AV) content through a combination of human and computer-based methods of description. One of the MeMAD workstreams addressed human approaches to describing visually salient cues. This included an analysis of the potential impact of omissions in AD, which is the focus of this paper. Using a corpus of approximately 500 audio-described film extracts, we identified the visual elements that can be considered essential for the construction of the filmic narrative. We then performed a qualitative analysis of the corresponding audio descriptions to determine how these elements are verbally represented and whether any omitted elements could be inferred from other cues that are accessible to visually impaired audiences. Finally, we identified the most likely source of these inferences and the conditions upon which retrieval could be predicated, preparing the ground for future reception studies to test our hypotheses with target audiences.
In this paper, we discuss the methodology used to determine where omissions occur in the analysed audio descriptions, consider worked examples from the MeMAD500 film corpus, and outline the findings of our study, namely that various strategies are relevant to inferring omitted information, including the use of proximal and distal contextual cues and reliance on common knowledge and iconic scenarios. To conclude, we consider ways of overcoming significant omissions in human-generated AD, such as using extended AD formats, and of mitigating similar gaps in machine-generated descriptions, where incorporating dialogue analysis and other supplementary data into the computer model could resolve many omissions.
Keywords
Audio description, Video description, Video captioning, Media accessibility, Audiovisual translation, Automation