High-Performance Unsupervised Relation Extraction from Large Corpora

Hong Kong (2006)

Abstract
We present URIES, an Unsupervised Relation Identification and Extraction system. The system automatically identifies interesting binary relations between entities in the input corpus, and then extracts a large number of instances of these relations. It discovers relations by clustering frequently co-occurring pairs of entities, based on the contexts in which they appear. Its complex pattern-based representation of the contexts allows the clustering step to achieve very high precision, sufficient for the clusters to serve as seed sets for bootstrapping a high-recall relation extraction process. In a series of experiments we demonstrate the successful performance of URIES and compare it to two existing systems: a weakly supervised high-recall Web relation extraction system called SRES, and an unsupervised relation identification system that uses a simpler bag-of-words representation of contexts. The experiments show that URIES performs comparably to SRES, but without any supervision, and that this performance is due to the power of its complex context representation and to its novel candidate selection method.
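The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the URIES algorithm: the abstract does not specify the pattern language, the similarity measure, or the clustering method, so the sketch assumes whole sentences with entity slots as "patterns", Jaccard overlap as similarity, and single-link grouping above a fixed threshold. The function names (`context_patterns`, `cluster_pairs`), the parameters (`min_count`, `threshold`), and the sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def context_patterns(mentions):
    """Map each entity pair to the set of slot-generalized contexts it occurs in."""
    patterns = defaultdict(set)
    for sentence, e1, e2 in mentions:
        # Assumption: a "pattern" is just the sentence with both entities
        # replaced by slots, so contexts of different pairs become comparable.
        patterns[(e1, e2)].add(sentence.replace(e1, "<E1>").replace(e2, "<E2>"))
    return patterns


def jaccard(a, b):
    """Jaccard overlap of two non-empty pattern sets."""
    return len(a & b) / len(a | b)


def cluster_pairs(mentions, min_count=2, threshold=0.5):
    """Group frequently co-occurring entity pairs whose context patterns overlap."""
    counts = defaultdict(int)
    for _, e1, e2 in mentions:
        counts[(e1, e2)] += 1
    patterns = context_patterns(mentions)
    # Keep only pairs that co-occur at least min_count times.
    frequent = [pair for pair in patterns if counts[pair] >= min_count]

    # Single-link grouping via union-find (an assumed, simple clustering choice).
    parent = {pair: pair for pair in frequent}

    def find(pair):
        while parent[pair] != pair:
            parent[pair] = parent[parent[pair]]
            pair = parent[pair]
        return pair

    for a, b in combinations(frequent, 2):
        if jaccard(patterns[a], patterns[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for pair in frequent:
        clusters[find(pair)].add(pair)
    return sorted(sorted(members) for members in clusters.values())


# Made-up sample data: (sentence, first entity, second entity).
mentions = [
    ("Oracle acquired Sun", "Oracle", "Sun"),
    ("Oracle completed its purchase of Sun", "Oracle", "Sun"),
    ("Google acquired YouTube", "Google", "YouTube"),
    ("Google completed its purchase of YouTube", "Google", "YouTube"),
    ("Larry Page is the CEO of Google", "Larry Page", "Google"),
    ("Larry Page , CEO of Google", "Larry Page", "Google"),
    ("Tim Cook is the CEO of Apple", "Tim Cook", "Apple"),
    ("Tim Cook , CEO of Apple", "Tim Cook", "Apple"),
    ("Apple acquired Beats", "Apple", "Beats"),  # occurs once, so it is filtered out
]
clusters = cluster_pairs(mentions)
for cluster in clusters:
    print(cluster)
```

On this toy input the acquisition pairs and the CEO pairs end up in separate clusters, and each cluster could then be used as a seed set for a bootstrapping extractor, which is the role the abstract assigns to the clusters.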
Keywords
existing system, clustering step, large corpora, interesting binary relation, high-recall relation extraction process, high-performance unsupervised relation extraction, high-recall web relation extraction, complex pattern-based representation, extraction system, unsupervised relation identification system, complex contexts representation, simpler bag-of-words representation, internet, bag of words, relation extraction, unsupervised learning, binary relation