Entity-centric document filtering: boosting feature mapping through meta-features.

CIKM '13: 22nd ACM International Conference on Information and Knowledge Management, San Francisco, California, USA, October 2013

Abstract
This paper studies the entity-centric document filtering task: given an entity represented by its identification page (e.g., a Wikipedia page), how can we correctly identify its relevant documents? In particular, we are interested in learning an entity-centric document filter from a small number of training entities, such that the filter can predict document relevance for a large set of unseen entities at query time. To characterize the relevance of a document, the problem boils down to learning keyword importance for the query entities. Since the same keyword can have very different importance for different entities, we abstract entity-centric document filtering as a transfer learning problem, and the challenge becomes how to appropriately transfer the keyword importance learned from training entities to query entities. Based on the insight that keywords sharing similar "properties" should have similar importance for their respective entities, we propose the novel concept of a meta-feature to map keywords across different entities. To realize meta-feature-based feature mapping, we develop and contrast two models, LinearMapping and BoostMapping. Experiments on three different datasets confirm the effectiveness of the proposed models, which show significant improvement over four state-of-the-art baseline methods.
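
The core idea, that keyword importance is predicted from entity-independent meta-features and not from the keyword itself, can be illustrated with a minimal sketch. This is not the paper's LinearMapping or BoostMapping model; the two meta-features, the weight values, and the function names `keyword_importance` and `score_document` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical meta-feature vector for each keyword of an entity, e.g.
# [rarity of the keyword, 1.0 if it appears in the page's first paragraph].
EntityMeta = Dict[str, List[float]]


def keyword_importance(meta: List[float], weights: List[float]) -> float:
    """Importance of a keyword as a linear function of its meta-features."""
    return sum(m * w for m, w in zip(meta, weights))


def score_document(doc_tokens: List[str], entity_meta: EntityMeta,
                   weights: List[float]) -> float:
    """Sum the predicted importance of every entity keyword found in the document."""
    present = set(doc_tokens)
    return sum(keyword_importance(meta, weights)
               for keyword, meta in entity_meta.items()
               if keyword in present)


# Weights are assumed to have been learned on training entities (made-up values).
weights = [0.5, 1.0]

# An unseen query entity: its keywords were never observed in training,
# but their meta-features live in the same space as the training keywords'.
unseen_entity: EntityMeta = {
    "quantum": [2.0, 1.0],  # rare and prominent on the entity page
    "the": [0.1, 0.0],      # common and uninformative
}

relevant = score_document(["the", "quantum", "paper"], unseen_entity, weights)
irrelevant = score_document(["the", "weather"], unseen_entity, weights)
```

Because the weights attach to meta-features and not to specific words, the same weight vector can score keywords of entities that were absent from training, which is the transfer step the abstract describes.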