Query Reorganization Algorithms for Efficient Boolean Information Filtering.

IEEE Trans. Knowl. Data Eng. (2017)

Abstract
In the information filtering paradigm, clients subscribe to a server with continuous queries that express their information needs and get notified every time appropriate information is published. To perform this task in an efficient way, servers employ indexing schemes that support fast matching of the incoming information against the query database. Such indexing schemes involve (i) main-memory trie-based data structures that cluster similar queries by capturing common elements between them and (ii) efficient filtering mechanisms that exploit this clustering to achieve high throughput and low filtering times. However, state-of-the-art indexing schemes are sensitive to the query insertion order and cannot adapt to an evolving query workload, degrading the filtering performance over time. In this paper, we present an adaptive trie-based algorithm that outperforms current methods by relying on query statistics to reorganise the query database. Contrary to previous approaches, we show that the nature of the constructed tries, rather than their compactness, is the determining factor for efficient filtering performance. Our algorithm does not depend on the order of insertion of queries in the database, manages to cluster queries even when clustering possibilities are limited, and achieves more than 96 percent filtering time improvement over its state-of-the-art competitors. Finally, we demonstrate that our solution is easily extensible to multi-core machines.
Keywords
Indexing, Clustering algorithms, Servers, Data models, Vegetation
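
The abstract describes trie-based indexes that cluster conjunctive queries by their shared terms and then filter incoming documents against that trie. The sketch below is only an illustration of that general idea, not the algorithm from the paper: the names `TrieNode`, `build_trie`, `frequency_rank`, and `match` are hypothetical, and ordering terms by their frequency across the query database is an assumed stand-in for "relying on query statistics".

```python
from collections import Counter


class TrieNode:
    """One trie node: child links keyed by term, plus queries that end here."""

    def __init__(self):
        self.children = {}   # term -> TrieNode
        self.query_ids = []  # ids of queries whose last term is this node


def frequency_rank(queries):
    """Assumed statistic: rank terms so the most frequent ones sort first."""
    counts = Counter(t for terms in queries.values() for t in terms)
    return {t: -c for t, c in counts.items()}


def build_trie(queries, term_rank):
    """Insert each conjunctive query (a set of terms) along a ranked term path.

    Because the path depends only on the term ranking, the resulting trie
    is the same whatever order the queries are inserted in.
    """
    root = TrieNode()
    for qid, terms in queries.items():
        node = root
        for term in sorted(terms, key=lambda t: (term_rank[t], t)):
            node = node.children.setdefault(term, TrieNode())
        node.query_ids.append(qid)
    return root


def match(root, doc_terms):
    """Return ids of queries whose terms all occur in the document."""
    matched = []
    stack = [root]
    while stack:
        node = stack.pop()
        matched.extend(node.query_ids)
        # Descend only along terms that the document actually contains.
        for term, child in node.children.items():
            if term in doc_terms:
                stack.append(child)
    return sorted(matched)


queries = {
    1: {"trie", "index"},
    2: {"trie", "filter"},
    3: {"index", "stream"},
}
trie = build_trie(queries, frequency_rank(queries))
result = match(trie, {"trie", "index", "filter"})
```

Here `result` holds the ids of the queries satisfied by the published document. Queries that share high-ranked terms share a path prefix, so one comparison against the document can rule out a whole cluster of queries at once.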