Select and Summarize: Scene Saliency for Movie Script Summarization
arXiv (2024)
Abstract
Abstractive summarization for long-form narrative texts such as movie scripts
is challenging due to the computational and memory constraints of current
language models. A movie script typically comprises a large number of scenes;
however, only a fraction of these scenes are salient, i.e., important for
understanding the overall narrative. The salience of a scene can be
operationalized by considering it as salient if it is mentioned in the summary.
Automatically identifying salient scenes is difficult due to the lack of
suitable datasets. In this work, we introduce a scene saliency dataset that
consists of human-annotated salient scenes for 100 movies. We propose a
two-stage abstractive summarization approach which first identifies the salient
scenes in a script and then generates a summary using only those scenes. Using
QA-based evaluation, we show that our model outperforms previous
state-of-the-art summarization methods and reflects the information content of
a movie more accurately than a model that takes the whole movie script as
input.
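
The two-stage idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the saliency classifier or the summarizer, so `score_scene`, `summarize`, and the `threshold` value are hypothetical placeholders, and the toy scorer and toy summarizer exist only to make the example runnable.

```python
from typing import Callable, List


def select_salient(scenes: List[str], scores: List[float],
                   threshold: float = 0.5) -> List[str]:
    """Stage 1: keep scenes whose saliency score reaches the threshold.

    Script order is preserved so the summarizer sees events in sequence.
    """
    if len(scenes) != len(scores):
        raise ValueError("scenes and scores must have the same length")
    return [scene for scene, score in zip(scenes, scores) if score >= threshold]


def select_and_summarize(scenes: List[str],
                         score_scene: Callable[[str], float],
                         summarize: Callable[[str], str],
                         threshold: float = 0.5) -> str:
    """Stage 2: summarize only the scenes selected in stage 1.

    `score_scene` and `summarize` are assumed stand-ins for a trained
    saliency classifier and an abstractive summarization model.
    """
    scores = [score_scene(scene) for scene in scenes]
    salient = select_salient(scenes, scores, threshold)
    return summarize("\n\n".join(salient))


# Toy stand-ins (assumptions for illustration only).
def toy_score(scene: str) -> float:
    # Pretend a scene is salient if it mentions the plot-relevant "letter".
    return 1.0 if "letter" in scene.lower() else 0.0


def toy_summarize(text: str) -> str:
    # Placeholder "summarizer": collapse whitespace instead of generating text.
    return " ".join(text.split())


demo_scenes = [
    "INT. KITCHEN - Anna finds the letter.",
    "EXT. STREET - A car passes.",
    "INT. OFFICE - Anna confronts her boss about the letter.",
]
demo_summary = select_and_summarize(demo_scenes, toy_score, toy_summarize)
```

Because the summarizer receives only the selected scenes, its input is shorter than the full script, which is the motivation the abstract gives for selecting before summarizing.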