Large-scale collective entity matching

Clinical Orthopaedics and Related Research(2011)

Citations: 161 | Views: 126
Abstract
There have been several recent advances in the Machine Learning community on the Entity Matching (EM) problem. However, their lack of scalability has prevented them from being applied in practical settings on large real-life datasets. To this end, we propose a principled framework to scale any generic EM algorithm. Our technique consists of running multiple instances of the EM algorithm on small neighborhoods of the data and passing messages across neighborhoods to construct a global solution. We prove formal properties of our framework and experimentally demonstrate the effectiveness of our approach in scaling EM algorithms.
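The neighborhood-plus-messages idea in the abstract can be illustrated with a small Python sketch. This is not the authors' algorithm, only one plausible reading of it under stated assumptions: the EM algorithm is treated as a black-box function that takes a neighborhood and the matches already known inside it, a "message" is simply a newly found match forwarded to every other neighborhood that contains both records, and the loop stops when no neighborhood produces anything new. The names `scale_matcher`, `toy_matcher`, and `RECORDS`, as well as the toy data, are hypothetical.

```python
from collections import deque
from itertools import combinations
from typing import Callable, FrozenSet, List, Set

Pair = FrozenSet[str]
Matcher = Callable[[Set[str], Set[Pair]], Set[Pair]]

def scale_matcher(neighborhoods: List[Set[str]], matcher: Matcher) -> Set[Pair]:
    """Run a black-box matcher per neighborhood and forward new matches as messages."""
    matches: Set[Pair] = set()
    pending = deque(range(len(neighborhoods)))  # neighborhoods still to (re)run
    while pending:
        i = pending.popleft()
        hood = neighborhoods[i]
        # Evidence: matches found so far whose records both lie in this neighborhood.
        evidence = {p for p in matches if p <= hood}
        new = matcher(hood, evidence) - matches
        if not new:
            continue
        matches |= new
        # Message passing (assumed form): re-activate every other neighborhood
        # that contains both records of at least one newly found match.
        for j, other in enumerate(neighborhoods):
            if j != i and j not in pending and any(p <= other for p in new):
                pending.append(j)
    return matches

# Hypothetical toy data: record id -> (name, id of a co-author record or None).
RECORDS = {
    "a1": ("J. Smith", None), "a2": ("J. Smith", None),
    "b1": ("M. Lee", "a1"),   "b2": ("Min Lee", "a2"),
    "c1": ("R. Roy", "b1"),   "c2": ("Raj Roy", "b2"),
}

def toy_matcher(hood: Set[str], evidence: Set[Pair]) -> Set[Pair]:
    """Toy collective matcher: equal names match; equal last names match
    only if the two records' co-authors are already known to match."""
    known = set(evidence)
    changed = True
    while changed:
        changed = False
        for x, y in combinations(sorted(hood), 2):
            pair = frozenset((x, y))
            if pair in known:
                continue
            (nx, cx), (ny, cy) = RECORDS[x], RECORDS[y]
            coauthors_match = (cx is not None and cy is not None
                               and frozenset((cx, cy)) in known)
            same_last = nx.split()[-1] == ny.split()[-1]
            if nx == ny or (same_last and coauthors_match):
                known.add(pair)
                changed = True
    return known

hoods = [{"b1", "b2", "c1", "c2"}, {"a1", "a2", "b1", "b2"}]
result = scale_matcher(hoods, toy_matcher)
print(sorted(tuple(sorted(p)) for p in result))
```

In this toy run the first neighborhood finds nothing on its own, because the evidence it needs (that `b1` and `b2` match) can only be derived in the second neighborhood. Once that match is forwarded, the first neighborhood is re-run and matches `c1` with `c2`, which is the kind of cross-neighborhood effect the abstract attributes to message passing.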
Keywords
formal property,practical setting,principled framework,entity matching,large real-life datasets,global solution,multiple instance,machine learning community,em algorithm,generic em algorithm,large-scale collective entity matching