Finding Diverse Neighbors in High Dimensional Space
2018 IEEE 34th International Conference on Data Engineering (ICDE), 2018
Abstract
Given a d-dimensional point query q, finding data items similar to q is a crucial task in many information retrieval and data mining applications. The typical approach is to find K items in a data set most similar to q, known as K nearest neighbors. Often, it is valuable to avoid too many answers that are too similar, and the importance of diversity has been considered in recent research. There are many different ways to characterize diversity, most of which depend on a notion of distance between points. In this paper, we propose a novel view of diversity based on spatial angles. This approach captures relevant and diverse results surrounding q from distinct directions even in high dimensional space. We present several algorithms to compute the diverse neighbor set, and show that it has several desirable properties. Extensive experiments demonstrate the effectiveness and efficiency of our methods on both real and synthetic data sets.
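To make the idea of angle-based diversity concrete, the following is a minimal illustrative sketch and not the paper's algorithm, which the abstract does not specify. It assumes a simple greedy rule: scan candidates in order of increasing Euclidean distance from q and keep a point only if the angle it forms at q with every already-kept point is at least a threshold. The function names `angle_at` and `diverse_neighbors` and the parameter `min_angle` are hypothetical, introduced only for this example.

```python
import math

def angle_at(q, a, b):
    """Angle in radians at q between the directions q->a and q->b."""
    u = [ai - qi for ai, qi in zip(a, q)]
    v = [bi - qi for bi, qi in zip(b, q)]
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0  # direction undefined for a point coincident with q
    cos = sum(x * y for x, y in zip(u, v)) / (nu * nv)
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.acos(max(-1.0, min(1.0, cos)))

def diverse_neighbors(points, q, k, min_angle):
    """Greedy sketch (an assumption, not the paper's method):
    visit points nearest-first and keep one only if it lies at least
    min_angle away, as seen from q, from every point kept so far."""
    # Points coincident with q have no direction, so they are skipped.
    ranked = sorted(
        (p for p in points if math.dist(p, q) > 0.0),
        key=lambda p: math.dist(p, q),
    )
    chosen = []
    for p in ranked:
        if len(chosen) == k:
            break
        if all(angle_at(q, p, c) >= min_angle for c in chosen):
            chosen.append(p)
    return chosen
```

With a plain K-nearest-neighbor query, several results can lie in nearly the same direction from q. Under the greedy rule assumed here, a near-duplicate direction is skipped in favor of a slightly farther point that approaches q from a different side.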
Keywords
diverse nearest neighbor search, angular diversity