Scalable Supervised Dimensionality Reduction Using Clustering

KDD' 13: The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Chicago Illinois USA August, 2013(2013)

引用 16|浏览134
暂无评分
摘要
The automated targeting of online display ads at scale requires the simultaneous evaluation of a single prospect against many independent models. When deciding which ad to show to a user, one must calculate likelihood-to-convert scores for that user across all potential advertisers in the system. For modern machine-learning-based targeting, as conducted by Media6Degrees (m6d), this can mean scoring against thousands of models in a large, sparse feature space. Dimensionality reduction within this space is useful, as it decreases scoring time and model storage requirements. To meet this need, we develop a novel algorithm for scalable supervised dimensionality reduction across hundreds of simultaneous classification tasks. The algorithm performs hierarchical clustering in the space of model parameters from historical models in order to collapse related features into a single dimension. This allows us to implicitly incorporate feature and label data across all tasks without operating directly in a massive space. We present experimental results showing that for this task our algorithm outperforms other popular dimensionality-reduction algorithms across a wide variety of ad campaigns, as well as production results that showcase its performance in practice.
更多
查看译文
关键词
supervised dimensionality reduction,clustering
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要