LEMP: Fast Retrieval of Large Entries in a Matrix Product

SIGMOD/PODS '15: International Conference on Management of Data, Melbourne, Victoria, Australia, May 2015

Abstract
We study the problem of efficiently retrieving large entries in the product of two given matrices, which arises in a number of data mining and information retrieval tasks. We focus on the setting where the two input matrices are tall and skinny, i.e., with millions of rows and tens to hundreds of columns. In such settings, the product matrix is large and its complete computation is generally infeasible in practice. To address this problem, we propose the LEMP algorithm, which efficiently retrieves only the large entries in the product matrix without actually computing it. LEMP maps the large-entry retrieval problem to a set of smaller cosine similarity search problems, for which existing methods can be used. We also propose novel algorithms for cosine similarity search, which are tailored to our setting. Our experimental study on large real-world datasets indicates that LEMP is up to an order of magnitude faster than state-of-the-art approaches.
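
The reduction from large-entry retrieval to cosine similarity search can be illustrated with a small sketch. The abstract does not give the details, so everything below is an assumption made for illustration: grouping the rows of the second matrix into buckets by length, the per-pair threshold `theta / (|q| * |p|)`, and the names `large_entries`, `bucket_size`, `Q`, and `P` are all hypothetical. The inner loop is a brute-force stand-in for whatever cosine similarity search method is used per bucket. It relies on the identity q·p = |q| |p| cos(q, p), so an entry is at least `theta` exactly when the cosine is at least `theta / (|q| |p|)`.

```python
import math


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def large_entries(Q, P, theta, bucket_size=2):
    """Return the set of (i, j) with dot(Q[i], P[j]) >= theta, for theta > 0.

    Illustrative sketch only: rows of P are bucketed by decreasing length,
    and each bucket is treated as a small cosine similarity search problem.
    """
    # Sort rows of P by decreasing length, then cut into fixed-size buckets.
    order = sorted(range(len(P)), key=lambda j: -norm(P[j]))
    buckets = [order[k:k + bucket_size]
               for k in range(0, len(order), bucket_size)]

    result = set()
    for i, q in enumerate(Q):
        q_len = norm(q)
        for bucket in buckets:
            # The first row in a bucket is its longest one.
            max_len = norm(P[bucket[0]])
            if q_len * max_len < theta:
                # No row here or in any later (shorter) bucket can reach theta.
                break
            for j in bucket:
                p_len = norm(P[j])
                if q_len * p_len < theta:
                    # Required cosine would exceed 1, so skip this row.
                    continue
                local_threshold = theta / (q_len * p_len)
                cosine = dot(q, P[j]) / (q_len * p_len)
                if cosine >= local_threshold:
                    result.add((i, j))
    return result


Q = [[1.0, 0.0], [0.0, 2.0]]
P = [[3.0, 0.0], [0.0, 0.5], [1.0, 1.0], [0.1, 0.1]]
print(sorted(large_entries(Q, P, theta=1.5)))
```

In this toy input only two of the eight entries of the product reach the threshold, and the second bucket (the two shortest rows of `P`) is skipped for both query rows without computing any inner product. That pruning is the point of the length-based grouping in this sketch; the actual bucketing and search strategies in LEMP may differ.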