Entity Set Expansion from Twitter.

ICTIR (2018)

Abstract
Online social media yields large-scale corpora that are highly informative and often contain many up-to-date entities. The challenge in expanding entity sets on social media text is to extract additional, less common entities using only the few seeds already in hand. In this paper, we present an approach that finds novel entities by expanding a small initial seed set on Twitter text. Our method first generates candidate sets on the basis of a semantic similarity feature. It then jointly uses 2 text-based features and 12 other features that carry social-media-specific information. From the scores on those features, a supervised algorithm learns a ranking model that assigns each candidate term a combined score, and the final ranked list is taken as the target expanded set. We run experiments with 24 entity classes on a Twitter corpus, and the expanded sets contain many novel entities that previous research has not fully detected. The experimental results on datasets from different years are also consistent with the expectation that the set of fresh entities changes over time.
Keywords
Social media mining, information extraction, entity set expansion
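
The two-stage pipeline described in the abstract (candidate generation by semantic similarity, followed by feature-based ranking) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy embeddings, the two hypothetical feature columns, the fixed weights standing in for a learned ranking model, and the function names `generate_candidates` and `rank_candidates`. The abstract does not name the supervised algorithm, so a plain weighted sum is used as a stand-in.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def generate_candidates(embeddings, seeds, top_k):
    """Stage 1 (assumed form): keep the top_k non-seed terms closest to the seed centroid."""
    dim = len(next(iter(embeddings.values())))
    centroid = [sum(embeddings[s][i] for s in seeds) / len(seeds) for i in range(dim)]
    scored = [
        (term, cosine(vec, centroid))
        for term, vec in embeddings.items()
        if term not in seeds
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def rank_candidates(candidates, features, weights):
    """Stage 2 (assumed form): order candidates by a weighted sum of feature scores.

    The weights stand in for a model learned by a supervised ranker.
    """
    def score(term):
        return sum(w * f for w, f in zip(weights, features[term]))

    return sorted((term for term, _ in candidates), key=score, reverse=True)


# Toy, hand-made embeddings (hypothetical; not from the paper's corpus).
embeddings = {
    "paris": [1.0, 0.0],
    "london": [0.9, 0.1],
    "berlin": [0.8, 0.2],
    "tokyo": [0.7, 0.3],
    "banana": [0.0, 1.0],
}
seeds = ["paris", "london"]
candidates = generate_candidates(embeddings, seeds, top_k=2)

# Two hypothetical feature scores per candidate: one text-based,
# one social-media-specific. The paper uses 2 + 12 such features.
features = {"berlin": [0.98, 0.2], "tokyo": [0.94, 0.9]}
weights = [0.5, 0.5]  # assumed; in practice these would be learned
ranked = rank_candidates(candidates, features, weights)
print(ranked)
```

Separating the stages keeps the expensive feature scoring limited to a short candidate list, which is one plausible reason for generating candidates by similarity first.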