Dynamic link-based ranking over large-scale graph-structured data (2010)

Abstract
Information retrieval techniques have been the primary means of keyword search in document collections. However, as the number and diversity of available semantic connections between objects increase, link-based ranking methods, including ObjectRank, have been proposed to provide high-recall semantic keyword search over graph-structured data. Since a wide variety of data sources can be modeled as data graphs, supporting keyword search over graph-structured data greatly improves the usability of such data sources. However, doing so is challenging in terms of both online performance and result quality. We first address the performance issue of dynamic authority-based ranking methods such as personalized PageRank and ObjectRank. Since they rank the nodes of a data graph dynamically using an expensive matrix-multiplication method, the online execution time increases rapidly as the data graph grows. On the 2007 English Wikipedia dataset, ObjectRank takes 20-40 seconds to compute query-specific relevance scores, which is unacceptable. We introduce a novel approach, BinRank, that approximates dynamic link-based ranking scores efficiently. BinRank partitions a dictionary into bins of relevant keywords and then constructs a materialized subgraph (MSG) per bin in a preprocessing stage. At query time, to produce highly accurate top-K results efficiently, BinRank uses the MSG corresponding to the given keyword instead of the original data graph.

PageRank and ObjectRank calculate the global importance score and the query-specific authority score of each node, respectively, by exploiting the link structure of a given data graph. However, both measures favor nodes with high in-degree that may contain popular yet generic content, and thus those nodes are frequently included in top-K lists regardless of the given query. We propose a novel ranking measure, Inverse ObjectRank, which measures the content-specificity of each node by traversing the semantic links in the data graph in the reverse direction. We then allow users to adjust the importance of the three ranking measures (global importance, query-relevance, and content-specificity) to improve the quality of search results.
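
The interplay of the three measures can be illustrated with a small sketch. It is not the implementation from this work: the adjacency-list graph format, the function names (`authority_scores`, `reverse_graph`, `combined_scores`), the damping factor of 0.85, and the linear weighting of the three scores are all assumptions made for illustration. The sketch runs a plain power iteration with restart on the full graph and does not attempt BinRank's materialized subgraphs. The reverse-direction walk is only a stand-in for the idea behind Inverse ObjectRank as described in the abstract.

```python
from typing import Dict, List, Set, Tuple

# Assumed format: node -> list of nodes it links to.
# Every link target must also appear as a key.
Graph = Dict[str, List[str]]


def reverse_graph(graph: Graph) -> Graph:
    """Return a copy of the graph with every edge direction flipped."""
    flipped: Graph = {node: [] for node in graph}
    for source, targets in graph.items():
        for target in targets:
            flipped[target].append(source)
    return flipped


def authority_scores(graph: Graph, base_set: Set[str],
                     damping: float = 0.85,
                     iterations: int = 50) -> Dict[str, float]:
    """Power iteration for a random surfer that restarts inside base_set."""
    if not base_set:
        raise ValueError("base_set must contain at least one node")
    restart = {n: (1.0 / len(base_set) if n in base_set else 0.0)
               for n in graph}
    scores = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in graph}
        for source, targets in graph.items():
            if targets:
                share = damping * scores[source] / len(targets)
                for target in targets:
                    nxt[target] += share
            else:
                # Dangling node: hand its mass back to the base set
                # so that the scores keep summing to 1.
                for n in base_set:
                    nxt[n] += damping * scores[source] / len(base_set)
        scores = nxt
    return scores


def combined_scores(graph: Graph, keyword_nodes: Set[str],
                    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)
                    ) -> Dict[str, float]:
    """Weighted sum of three scores (the linear form is an assumption)."""
    # Global importance: restart anywhere in the graph.
    global_importance = authority_scores(graph, set(graph))
    # Query relevance: restart only at nodes containing the keyword.
    relevance = authority_scores(graph, keyword_nodes)
    # Content-specificity stand-in: same walk, but along reversed links.
    specificity = authority_scores(reverse_graph(graph), keyword_nodes)
    w_global, w_relevance, w_specificity = weights
    return {n: w_global * global_importance[n]
               + w_relevance * relevance[n]
               + w_specificity * specificity[n]
            for n in graph}


# Toy graph: "survey" is a generic node with high in-degree.
demo_graph: Graph = {
    "paper_a": ["survey"],
    "paper_b": ["survey", "paper_a"],
    "paper_c": ["survey"],
    "survey": [],
}
scores = combined_scores(demo_graph, {"paper_b"}, weights=(0.2, 0.5, 0.3))
ranking = sorted(scores, key=scores.get, reverse=True)
```

Changing the `weights` tuple plays the role of the user-adjustable importance of each measure. Setting the first and third weights to zero reduces the result to the query-specific authority score alone.
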
Keywords
graph-structured data, data graph, original data graph, link-based ranking method, keyword search, large-scale graph-structured data, dynamic authority-based ranking method, dynamic link-based ranking score, Inverse ObjectRank, novel ranking measure, data source