Crystal:Employer Name Normalization in the Online Recruitment Industry

KDD(2016)

引用 11|浏览36
暂无评分
摘要
Entity linking links entity mentions in text to the corresponding entities in a knowledge base (KB) and has many applications in both open domain and specific domains. For example, in the recruitment domain, linking employer names in job postings or resumes to entities in an employer KB is very important to many business applications. In this paper, we focus on this employer name normalization task, which has several unique challenges: handling employer names from both job postings and resumes, leveraging the corresponding location context, and handling name variations, irrelevant input data, and noises in the KB. We present a system called CompanyDepot which contains a machine learning based approach CompanyDepot-ML and a heuristic approach CompanyDepot-H to address these challenges in three steps: (1) searching for candidate entities based on a customized search engine for the KB; (2) ranking the candidate entities using learning-to-rank methods or heuristics; and (3) validating the top-ranked entity via binary classification or heuristics. While CompanyDepot-ML shows better extendability and flexibility, CompanyDepot-H serves as a strong baseline and useful way to collect training data for CompanyDepot-ML. The proposed system achieves 2.5%-21.4% higher coverage at the same precision level compared to an existing system used at CareerBuilder over multiple real-world datasets. Applying the system to a similar task of academic institution name normalization further shows the generalization ability of the method.
更多
查看译文
关键词
employer name normalization,entity linking,named entity disambiguation,learning to rank
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要