Better Graph Embeddings for Enterprise Graphs

Rajeev Gupta, Madhusudhanan Krishnamoorthy, Vipindeep Vangela

PROCEEDINGS OF 7TH JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE AND MANAGEMENT OF DATA, CODS-COMAD 2024 (2024)

Abstract
Graph embeddings are scalable and performant representations of the nodes in a graph. Fast Random Projections (FastRP) is claimed to generate embeddings thousands of times faster than random-walk-based algorithms such as DeepWalk and Node2Vec, while achieving comparable performance on various downstream tasks. In this paper, we consider graph embeddings for enterprise graphs: communication graphs whose nodes are emails, meetings, documents, etc., connected if they share a common contact (person) or topic (keyword). We consider meeting-to-email, meeting-to-document, and email-to-document recommendations as the downstream tasks. When we use the FastRP algorithm to retrieve relevant entities (e.g., relevant emails for a given meeting), we get a large number of false positives, i.e., emails that have high embedding similarity to the meeting but are not actually relevant according to the ground truth. We present a modified FastRP algorithm that delays the random projections to improve relevance at a slight additional cost. Specifically, using real enterprise data, we show that embedding performance can be improved by more than 10% (Recall@N) with a slight increase in embedding generation cost. We currently use this algorithm to generate embeddings of billions of entities for millions of Microsoft customers, powering a number of recommendation applications.
Keywords
Enterprise graphs, graph embeddings, random projections, meeting intelligence
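
To make the baseline and the metric in the abstract concrete, here is a minimal sketch of a FastRP-style embedding and of Recall@N on a dense NumPy adjacency matrix. It is a simplified illustration and not the authors' implementation: the function names, the sparsity parameter `s`, the hop `weights`, and the omission of FastRP's degree-based scaling are all assumptions. The paper's delayed-projection modification is not shown, since the abstract does not specify it.

```python
import numpy as np

def sparse_random_projection(n_rows, dim, s=3.0, seed=0):
    """Very sparse random matrix with entries in {-sqrt(s), 0, +sqrt(s)}."""
    rng = np.random.default_rng(seed)
    probs = [1 / (2 * s), 1 - 1 / s, 1 / (2 * s)]
    return rng.choice([-np.sqrt(s), 0.0, np.sqrt(s)], size=(n_rows, dim), p=probs)

def row_normalize(adj):
    """Random-walk transition matrix D^-1 A; isolated nodes keep zero rows."""
    adj = adj.astype(float)
    deg = adj.sum(axis=1, keepdims=True)
    return np.divide(adj, deg, out=np.zeros_like(adj), where=deg > 0)

def l2_normalize(x):
    """Scale each row to unit length; all-zero rows stay zero."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return np.divide(x, norms, out=np.zeros_like(x), where=norms > 0)

def fastrp_sketch(adj, dim, weights, seed=0):
    """Simplified FastRP: project once, then propagate and mix the hops.

    Assumption: degree-based scaling of the projection is omitted, and
    `weights` (one weight per hop) is a hypothetical parameter choice.
    """
    transition = row_normalize(adj)
    current = sparse_random_projection(adj.shape[0], dim, seed=seed)
    emb = np.zeros((adj.shape[0], dim))
    for w in weights:
        current = transition @ current      # one more hop of propagation
        emb += w * l2_normalize(current)    # weighted sum of per-hop features
    return l2_normalize(emb)

def recall_at_n(query_emb, cand_emb, relevant, n):
    """Fraction of ground-truth relevant candidates found in the top n."""
    if not relevant:
        return 0.0
    scores = cand_emb @ query_emb           # cosine similarity for unit vectors
    top = np.argsort(-scores)[:n]
    return len(set(top.tolist()) & set(relevant)) / len(relevant)

# Toy usage: a 4-node path graph standing in for an enterprise graph.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
emb = fastrp_sketch(adj, dim=16, weights=[0.0, 1.0, 1.0])
```

In a recommendation setting such as meeting-to-email, `query_emb` would be the embedding of a meeting node and `cand_emb` the embeddings of candidate email nodes. A false positive in the abstract's sense is a candidate that ranks in the top N by similarity but is absent from `relevant`.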