Distributed Representations of Words to Guide Bootstrapped Entity Classifiers.
HLT-NAACL (2015)
Abstract
Bootstrapped classifiers iteratively generalize from a few seed examples or prototypes to other examples of target labels. However, sparseness of language and limited supervision make the task difficult. We address this problem by using distributed vector representations of words to aid the generalization. We use the word vectors to expand entity sets used for training classifiers in a bootstrapped pattern-based entity extraction system. Our experiments show that the classifiers trained with the expanded sets perform better on entity extraction from four online forums, with 30% F1 improvement on one forum. The results suggest that distributed representations can provide good directions for generalization in a bootstrapping system.
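The idea of expanding an entity set with word vectors can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the expansion is done, so the average cosine similarity criterion, the `threshold` value, the function name `expand_entity_set`, and the toy three-dimensional vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_entity_set(seeds, candidates, vectors, threshold=0.8):
    """Add candidates whose average similarity to the seeds meets the threshold.

    Hypothetical helper: the selection rule and the threshold are assumptions,
    not details taken from the paper.
    """
    seed_vecs = [vectors[s] for s in seeds if s in vectors]
    expanded = set(seeds)
    if not seed_vecs:
        return expanded
    for word in candidates:
        if word in expanded or word not in vectors:
            continue
        avg_sim = sum(cosine(vectors[word], sv) for sv in seed_vecs) / len(seed_vecs)
        if avg_sim >= threshold:
            expanded.add(word)
    return expanded


# Toy vectors invented for this example; real systems would load trained embeddings.
TOY_VECTORS = {
    "ibuprofen": [0.9, 0.1, 0.0],
    "aspirin": [0.8, 0.2, 0.1],
    "tylenol": [0.85, 0.15, 0.05],
    "headache": [0.1, 0.9, 0.2],
    "yesterday": [0.0, 0.1, 0.9],
}

expanded = expand_entity_set(
    {"ibuprofen", "aspirin"},
    ["tylenol", "headache", "yesterday"],
    TOY_VECTORS,
)
print(sorted(expanded))
```

In a bootstrapping loop, a set expanded this way could serve as additional training examples for the entity classifier in each iteration. A stricter threshold would add fewer words and trade recall for precision.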