Joint Multi-Field Siamese Recurrent Neural Network For Entity Resolution

PRICAI 2018: Trends in Artificial Intelligence, Part II (2018)

Abstract
Entity resolution, which determines whether two records refer to the same entity, has a wide range of applications in both data cleaning and data integration. Traditional approaches either use string metrics to compute matching scores for record pairs or apply machine learning techniques with hand-crafted features. However, the effectiveness of these methods depends largely on designing good domain-specific metrics or on extracting discriminative features, which requires rich domain knowledge. In addition, traditional learning-based methods usually ignore the discrepancies between a citation's fields. In this paper, to reduce the impact of information gaps between different fields and to take full advantage of the semantic and contextual information in each field, we present a novel joint multi-field siamese recurrent architecture. In particular, our method employs a word-based Long Short-Term Memory (LSTM) network for fields whose words are strongly related to one another and a character-based Recurrent Neural Network (RNN) for fields whose words are only weakly related, so that each field's temporal information is exploited effectively. Experimental results on three datasets demonstrate that our model learns discriminative features and outperforms several baseline methods as well as other RNN-based methods.
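The multi-field siamese idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the model's details, so everything below is an assumption made for illustration. In particular, the field names (`title`, `authors`), their assignment to word-level or character-level encoding, the hidden sizes, the concatenation of field encodings, and the `exp(-L1)` similarity are hypothetical choices, and a plain tanh RNN cell with random, untrained weights stands in for both the LSTM and the RNN to keep the example short and dependency-free. What it does show is the siamese property: both records pass through the same per-field encoders with shared weights before being compared.

```python
import math
import random

HIDDEN = 8  # hidden-state size (arbitrary for this sketch)
EMBED = 8   # token-embedding size (arbitrary for this sketch)


class TinyRNNEncoder:
    """Plain tanh RNN over pseudo-embeddings; weights are random and untrained."""

    def __init__(self, seed):
        rng = random.Random(seed)
        self.seed = seed
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(EMBED)]
                     for _ in range(HIDDEN)]
        self.w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
                      for _ in range(HIDDEN)]

    def embed(self, token):
        # Deterministic stand-in for a learned embedding table.
        rng = random.Random(f"{self.seed}:{token}")
        return [rng.uniform(-1.0, 1.0) for _ in range(EMBED)]

    def encode(self, tokens):
        # Run the recurrence and return the final hidden state.
        h = [0.0] * HIDDEN
        for tok in tokens:
            x = self.embed(tok)
            h = [
                math.tanh(
                    sum(w * xi for w, xi in zip(self.w_in[j], x))
                    + sum(w * hi for w, hi in zip(self.w_rec[j], h))
                )
                for j in range(HIDDEN)
            ]
        return h


def word_tokens(text):
    return text.lower().split()


def char_tokens(text):
    return list(text.lower())


# Hypothetical field layout (an assumption, not taken from the paper):
# word-level tokens for "title", character-level tokens for "authors".
FIELD_ENCODERS = {
    "title": (word_tokens, TinyRNNEncoder(seed=1)),
    "authors": (char_tokens, TinyRNNEncoder(seed=2)),
}


def encode_record(record):
    # Joint representation: concatenate the per-field encodings.
    vec = []
    for field, (tokenize, encoder) in FIELD_ENCODERS.items():
        vec.extend(encoder.encode(tokenize(record.get(field, ""))))
    return vec


def similarity(rec_a, rec_b):
    # Siamese comparison: the same encoders (shared weights) process both
    # records; exp(-L1 distance) maps the result into (0, 1].
    va, vb = encode_record(rec_a), encode_record(rec_b)
    return math.exp(-sum(abs(x - y) for x, y in zip(va, vb)))
```

Calling `similarity(rec_a, rec_b)` on two dictionaries with `title` and `authors` keys returns a score in (0, 1], with identical records scoring exactly 1.0. In a real system the encoders would be trained on labeled matching and non-matching pairs so that the score separates the two classes; with the random weights used here the scores carry no meaning beyond demonstrating the data flow.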
Keywords
Entity resolution, Joint multi-field siamese architecture, Recurrent Neural Network, Long Short-Term Memory