Automated Context-Aware Phrase Mining from Text Corpora.

DASFAA (2), 2021

Abstract
Phrase mining aims to automatically extract high-quality phrases from a given corpus, an essential step in transforming unstructured text into structured information. Existing statistics-based methods have achieved state-of-the-art performance on this task. However, such methods rely heavily on statistical signals to extract quality phrases and ignore the effect of contextual information. In this paper, we propose ConPhrase, a novel context-aware method for automated phrase mining that formulates phrase mining as a sequence labeling problem and takes contextual information into account. To tackle the scarcity of global information and the need to filter noisy data, ConPhrase comprises two modules: (1) a topic-aware phrase recognition network that incorporates domain-related topic information into word representation learning to identify quality phrases effectively, and (2) an instance selection network that uses reinforcement learning to choose correct sentences, further improving the prediction performance of the phrase recognition network. Experimental results demonstrate that ConPhrase outperforms the state-of-the-art approach.
Keywords
Phrase mining, Quality phrase recognition, Information extraction
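
To make the sequence labeling formulation mentioned in the abstract concrete, the sketch below shows how per-token labels could be turned back into phrases. The abstract does not say which tagging scheme ConPhrase uses, so the B/I/O scheme here is an assumption, and the function name `decode_phrases` and the example sentence are hypothetical. It illustrates only the decoding step, not the topic-aware recognition network or the instance selection network.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, phrase text)


def decode_phrases(tokens: List[str], tags: List[str]) -> List[Span]:
    """Collect phrase spans from per-token tags.

    Assumed scheme (not confirmed by the paper): "B" begins a phrase,
    "I" continues it, and any other tag is outside a phrase.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Span] = []
    start = None  # index where the currently open phrase began, if any

    for i, tag in enumerate(tags):
        if tag == "B":
            # A new phrase starts; close any phrase that is still open.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag == "I":
            # Tolerate an "I" with no preceding "B" by opening a phrase here.
            if start is None:
                start = i
        else:
            # Outside a phrase; close the open phrase, if any.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
                start = None

    # Close a phrase that runs to the end of the sentence.
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))

    return spans


# Hypothetical example sentence and labels, for illustration only.
example_tokens = ["We", "study", "phrase", "mining", "for", "information", "extraction"]
example_tags = ["O", "O", "B", "I", "O", "B", "I"]
example_spans = decode_phrases(example_tokens, example_tags)
```

In a labeling-based setup of this kind, a trained tagger would supply the tags for each sentence, and a decoding step like this one would turn them into candidate quality phrases.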