Domain Adaptation and Summary Distillation for Unsupervised Query Focused Summarization

Jiancheng Du, Yang Gao

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2024)

Abstract
Text summarization is the task of reducing a document's length while preserving its essential information. In the age of information explosion, extracting the content that users need from a large volume of information has become particularly important. Under such circumstances, query-focused abstractive summarization (QFS) is gaining prominence because it focuses on user needs while delivering fluent, concise, paraphrased summaries. However, unlike generic summarization, which has achieved remarkable progress driven by a substantial amount of parallel data, QFS struggles because of a lack of parallel corpora. Therefore, in this paper, we leverage a typical large generic summarization dataset to meet the pressing demand for unsupervised QFS. The large-scale query-free benchmark is automatically transformed into a query-focused dataset (Query-CNNDM) while preserving its informative summaries. We propose a simple yet effective unsupervised method, called the Domain Adaptation and Summary Distillation (DASD) method. To achieve domain adaptation for unsupervised QFS, we design a query-aware gap sentence generation (q-GSG) strategy that equips the model to learn target-domain textual knowledge and obtain a good initialization in the target domain. As instance-specific regularization, we train a teacher model on Query-CNNDM to generate pseudo-labels for summary distillation. Experimental results indicate that our DASD model achieves state-of-the-art performance on two benchmark datasets, Debatepedia and Wikiref, in a zero-shot setting and generalizes well to few-shot abstractive QFS.
Keywords
Adaptation models, Task analysis, Data models, Training, Benchmark testing, Predictive models, Question answering (information retrieval), Abstractive summarization, domain adaptation, query-focused summarization, summary distillation, unsupervised learning