Experts community memory for entity similarity functions recommendation.

Information Sciences (2017)

Abstract
Similarity search (or similar entity search) is the process of finding all entities similar to a given entity (e.g., a person, a document, or an image). Although many techniques for similarity analysis have been proposed, little work has addressed the question of which of these techniques is most suitable for a given similarity search task. Knowing the right similarity function is important because the task is highly domain- and data-dependent. In this article, we provide an approach for recommending which similarity functions (e.g., edit distance or Jaccard similarity) should be used for measuring the similarity between two entities. The approach employs an incremental knowledge acquisition technique for capturing domain experts’ knowledge about similarity functions and their usage contexts (e.g., entity class, attribute name, and keywords). In addition, for situations where domain experts have little or no knowledge about the datasets, we analyze the features of the datasets and then suggest similarity functions based on the identified features. We also demonstrate the feasibility and effectiveness of the proposed approach on several real-world datasets from different domains.
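As an illustration of the two similarity functions named in the abstract, and of what a context-based recommendation could look like, the following is a minimal Python sketch. It is not the paper's method: the `RULES` table, the `recommend` function, and the example contexts are hypothetical stand-ins for the expert knowledge that the approach acquires incrementally, and Jaccard similarity is assumed here to operate on lowercase whitespace tokens.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity over lowercase whitespace tokens (an assumption)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # two empty token sets are treated as identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical rules mapping a usage context (entity class, attribute name)
# to a similarity function name. Invented for illustration only.
RULES = {
    ("person", "name"): "edit_distance",
    ("document", "title"): "jaccard_similarity",
}


def recommend(entity_class: str, attribute: str,
              default: str = "jaccard_similarity") -> str:
    """Look up a function name for the context, falling back to a default."""
    return RULES.get((entity_class.lower(), attribute.lower()), default)
```

For example, `recommend("person", "name")` selects `edit_distance`, which suits short strings with typos, while an unlisted context falls back to the default. A real system would replace the fixed table with rules captured from domain experts or derived from dataset features, as the abstract describes.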
Keywords
Entity matching, Similarity function, Incremental knowledge acquisition, Similarity search, Similarity measure, Recommendation