CaSMoS: A Framework for Learning Candidate Selection Models over Structured Queries and Documents

KDD(2016)

引用 41|浏览85
暂无评分
摘要
User experience at social media and web platforms such as LinkedIn is heavily dependent on the performance and scalability of its products. Applications such as personalized search and recommendations require real-time scoring of millions of structured candidate documents associated with each query, with strict latency constraints. In such applications, the query incorporates the context of the user (in addition to search keywords if present), and hence can become very large, comprising of thousands of Boolean clauses over hundreds of document attributes. Consequently, candidate selection techniques need to be applied since it is infeasible to retrieve and score all matching documents from the underlying inverted index. We propose CaSMoS, a machine learned candidate selection framework that makes use of Weighted AND (WAND) query. Our framework is designed to prune irrelevant documents and retrieve documents that are likely to be part of the top-k results for the query. We apply a constrained feature selection algorithm to learn positive weights for feature combinations that are used as part of the weighted candidate selection query. We have implemented and deployed this system to be executed in real time using LinkedIn's Galene search platform. We perform extensive evaluation with different training data approaches and parameter settings, and investigate the scalability of the proposed candidate selection model. Our deployment of this system as part of LinkedIn's job recommendation engine has resulted in significant reduction in latency (up to 25%) without sacrificing the quality of the retrieved results, thereby paving the way for more sophisticated scoring models.
更多
查看译文
关键词
Candidate selection,Learning query models,Personalized search and recommendation systems
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要