MultPAX: Keyphrase Extraction Using Language Models and Knowledge Graphs.

IEEE International Semantic Web Conference(2022)

引用 1|浏览5
暂无评分
摘要
Keyphrase extraction aims to identify a small set of phrases that best describe the content of text. The automatic generation of keyphrases has become essential for many natural language applications such as text categorization, indexing, and summarization. In this paper, we propose MULTPAX, a multitask framework for extracting present and absent keyphrases using pre-trained language models and knowledge graphs. In particular, our framework contains three components: first, MULTPAX identifies present keyphrases from an input document. Then, MULTPAX links with external knowledge graphs to get more relevant phrases. Finally, MULTPAX ranks the extracted phrases based on their semantic relatedness to the input document and return top-k phrases as a final output. We conducted several experiments on four benchmark datasets to evaluate the performance of MULTPAX against different state-of-the-art baselines. The evaluation results demonstrate that our approach significantly outperforms the state-of-the-art baselines, with a significance t-test p < 0.041. Our source code and datasets are public available at https://github.com/dice-group/MultPAX.
更多
查看译文
关键词
Present keyphrase extraction,Absent keyphrase generation,Knowledge graph,Pre-trained language models
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要