# Query clustering based on bid landscape for sponsored search auction optimization

KDD, pp.1150-1158, (2013)

Abstract

In sponsored search auctions, the auctioneer operates the marketplace by setting a number of auction parameters such as reserve prices for the task of auction optimization. The auction parameters may be set for each individual keyword, but the optimization problem becomes intractable since the number of keywords is in the millions. To red...

Introduction

- Advertisers bid on keywords for advertising opportunities alongside algorithmic search results, through a generalized second-price auction (GSP) [6].
- An example of an auction parameter is reserve prices; only ads that clear the reserve price participate in the auction [12, 13].
- Another example is the exponent to which the CTR estimate is raised in the rank score function [9, 11].
- To reduce the dimensionality and obtain a parsimonious model that generalizes well, one wishes to cluster keywords or queries into meaningful groups and set parameters at the keyword-cluster level.
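
The two auction parameters named above can be illustrated with a minimal Python sketch of a generalized second-price auction. It assumes a rank score of `bid * ctr ** exponent`, a reserve price applied to the raw bid, and a per-click price equal to the smallest bid that keeps an ad ahead of the next one. The `Ad` class, the `run_gsp` function, and the sample numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str
    bid: float  # advertiser's bid per click
    ctr: float  # estimated click-through rate


def run_gsp(ads, reserve=0.0, exponent=1.0, slots=2):
    """Rank ads by bid * ctr**exponent and charge GSP prices per click."""
    # Assumption: only ads whose bid clears the reserve price participate.
    eligible = [a for a in ads if a.bid >= reserve]
    ranked = sorted(eligible, key=lambda a: a.bid * a.ctr ** exponent, reverse=True)
    results = []
    for i, ad in enumerate(ranked[:slots]):
        weight = ad.ctr ** exponent
        if i + 1 < len(ranked):
            nxt = ranked[i + 1]
            # Smallest bid that keeps this ad ranked above the next one.
            price = nxt.bid * nxt.ctr ** exponent / weight
        else:
            price = 0.0
        # The price per click never drops below the reserve.
        results.append((ad.name, round(max(price, reserve), 4)))
    return results


ads = [Ad("A", 2.0, 0.10), Ad("B", 1.5, 0.20), Ad("C", 0.4, 0.30)]
print(run_gsp(ads, reserve=0.5))   # C is filtered out by the reserve
print(run_gsp(ads, exponent=0.0))  # exponent 0 ranks by bid alone
```

With a reserve of 0.5, ad C does not participate, B wins the top slot at a price of 1.0, and A pays the reserve. Setting these parameters per keyword cluster instead of per keyword is what the clustering is meant to enable.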

Highlights

- In search advertising, advertisers bid on keywords for advertising opportunities alongside algorithmic search results, through a generalized second-price auction (GSP) [6]
- The auctioneer or the search engine operates the marketplace by setting a number of auction parameters, which play an important part in determining the outcome of the auction
- We examine some empirical bid distributions in sponsored search auctions in Section 3, to support the parametric assumption that each keyword is represented as a Gaussian mixture density
- With the learned Gaussian mixture model for keywords described in Section 3, we apply the variational EM algorithm described in Section 6 to cluster keywords into k partitions
- We have presented a formalism of clustering probability distributions, motivated by real-world applications where observations are drawn from underlying distributions and the goal is to cluster the underlying concepts with uncertainty
- The algorithm has been applied to the important problem of sponsored search auction optimization, yielding a significant improvement in click-through rate over k-means in offline simulation, as well as improvements in revenue and clicks over the existing production system
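
The KL divergence between two Gaussian mixtures has no closed form, so comparing keyword GMMs requires an approximation. The sketch below shows one well-known option, the variational approximation from the cited reference by Hershey and Olsen, for univariate mixtures. Whether the paper uses this exact approximation is not stated in this summary, and the function names and the `(weight, mean, variance)` representation are assumptions made for illustration.

```python
import math


def kl_gauss(m1, v1, m2, v2):
    """Closed-form KL( N(m1, v1) || N(m2, v2) ), with v denoting variance."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)


def kl_gmm_variational(f, g):
    """Variational approximation of KL(f || g).

    f and g are lists of (weight, mean, variance) components.
    """
    total = 0.0
    for wa, ma, va in f:
        # How well component a is explained by f itself ...
        num = sum(wb * math.exp(-kl_gauss(ma, va, mb, vb)) for wb, mb, vb in f)
        # ... versus how well it is explained by g.
        den = sum(wb * math.exp(-kl_gauss(ma, va, mb, vb)) for wb, mb, vb in g)
        total += wa * math.log(num / den)
    return total


keyword_a = [(0.7, 1.0, 0.2), (0.3, 3.0, 0.5)]
keyword_b = [(0.5, 1.2, 0.3), (0.5, 4.0, 1.0)]
print(kl_gmm_variational(keyword_a, keyword_b))
```

Two sanity checks follow from the formula: the approximation is zero when both mixtures are identical, and it reduces to the exact Gaussian KL divergence when each mixture has a single component.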

Results

- With the learned GMMs for keywords described in Section 3, the authors apply the variational EM algorithm described in Section 6 to cluster keywords into k partitions.
- The algorithm does not partition examples in the Euclidean sense: for example, more clusters are derived in the low-variance area, since those examples have a greater impact on the total KL-divergence loss (Eq. (16)).
- Figure 3(b) illustrates how keywords are clustered, where each ball represents a keyword GMM and each same-color cloud forms a cluster.
- The clustering exhibits a meaningful yet non-Euclidean pattern, e.g., low-variance clusters are more densely populated with keywords.
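
The low-variance effect described above can be checked with the closed-form KL divergence between univariate Gaussians. The short Python sketch below uses arbitrary illustrative numbers, not values from the paper. It shifts the mean by a fixed amount and shows that the divergence grows as the variance shrinks.

```python
import math


def kl_gauss(mean_p, var_p, mean_q, var_q):
    """Closed-form KL( N(mean_p, var_p) || N(mean_q, var_q) )."""
    return 0.5 * (
        math.log(var_q / var_p)
        + (var_p + (mean_p - mean_q) ** 2) / var_q
        - 1.0
    )


shift = 0.5  # the same mean offset in every case
for var in (0.1, 1.0, 10.0):
    # With equal variances the divergence is shift**2 / (2 * var).
    print(f"variance={var:5.1f}  KL={kl_gauss(0.0, var, shift, var):.4f}")
```

With equal variances the divergence equals `shift**2 / (2 * var)`, so the same mean offset costs 100 times more at variance 0.1 than at variance 10. This is consistent with low-variance keywords attracting more clusters.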

Conclusion

- The authors have presented a formalism of clustering probability distributions, motivated by real-world applications where observations are drawn from underlying distributions and the goal is to cluster the underlying concepts with uncertainty.
- The authors have derived the algorithms for clustering Gaussian densities and GMMs, while the underlying principle generalizes to other distributions such as beta distribution for binomially distributed data, Dirichlet distribution for multinomial data, and gamma distribution for Poisson data.
- The algorithm has been applied to the important problem of sponsored search auction optimization, yielding a significant improvement in CTR over k-means in offline simulation, as well as improvements in revenue and clicks over the existing production system.

- Table 1: Auction optimization results with different clustering methods
- Table 2: Online A/B testing results

Contributions

- Presents a formalism of clustering probability distributions, and its application to query clustering where each query is represented as a probability density of click-through rate weighted bid and distortion is measured by KL divergence
- Develops an algorithm for clustering Gaussian mixture densities, which generalize a single Gaussian and are typically a more realistic parametric assumption for real-world data
- The main contribution of this paper is to present a formalism of clustering probability distributions
- Describes a query clustering algorithm where each query is represented as a probability density of CTR-weighted bid and distortion is measured by Kullback-Leibler divergence
- Examines some empirical bid distributions in sponsored search auctions in Section 3, to support the parametric assumption that each keyword is represented as a Gaussian mixture density
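
To make the idea of clustering densities concrete, here is a simplified Python sketch that clusters single univariate Gaussians with hard, k-means-style assignments under KL divergence. It is not the paper's variational EM algorithm for GMMs. The direction of the divergence (keyword density to cluster density), the function names, and the sample data are all assumptions. The centroid step uses the standard result that the Gaussian minimizing the summed KL divergence from a set of Gaussians matches their average mean and average second moment.

```python
import math


def kl_gauss(m1, v1, m2, v2):
    """Closed-form KL( N(m1, v1) || N(m2, v2) ), with v denoting variance."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)


def centroid(members):
    """Gaussian q minimizing sum_i KL(p_i || q): match mean and second moment."""
    n = len(members)
    mean = sum(m for m, _ in members) / n
    second = sum(v + m * m for m, v in members) / n
    return mean, second - mean * mean


def cluster_gaussians(items, centers, iters=20):
    """Hard-assignment clustering of (mean, variance) pairs under KL divergence."""
    assign = []
    for _ in range(iters):
        # Assignment step: the nearest centre in KL(item || centre).
        assign = [
            min(range(len(centers)), key=lambda k: kl_gauss(m, v, *centers[k]))
            for m, v in items
        ]
        # Update step: moment-matched centroid of each cluster.
        new_centers = []
        for k in range(len(centers)):
            members = [items[i] for i, a in enumerate(assign) if a == k]
            # Keep the old centre if a cluster becomes empty.
            new_centers.append(centroid(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return assign, centers


keywords = [(0.0, 0.1), (0.2, 0.1), (5.0, 1.0), (5.5, 1.0)]
labels, centers = cluster_gaussians(keywords, [(0.0, 1.0), (5.0, 1.0)])
print(labels, centers)
```

Note that each centroid's variance combines the average member variance with the spread of the member means, so a cluster of tightly concentrated keywords stays tight only if their means are also close together.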

References

- [1] A. Banerjee, S. Merugu, I. S. Dhillon, and J. Ghosh. Clustering with Bregman divergences. Journal of Machine Learning Research, 6:1705–1749, 2005.
- [2] Y. Chen, M. Kapralov, D. Pavlov, and J. F. Canny. Factor modeling for advertisement targeting. Advances in Neural Information Processing Systems (NIPS 2009), 22:324–332, 2009.
- [3] Y. Chen and T. W. Yan. Position-normalized click prediction in search advertising. Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2012), 2012.
- [4] T. M. Cover and J. A. Thomas. Elements of Information Theory, page 26. Wiley-Interscience, 1991.
- [5] M. N. Do. Fast approximation of Kullback-Leibler distance for dependence trees and hidden Markov models. IEEE Signal Processing Letters, 10(4):115–118, 2003.
- [6] B. Edelman, M. Ostrovsky, and M. Schwarz. Internet advertising and the generalized second price auction: selling billions of dollars worth of keywords. American Economic Review, 97(1):242–259, 2007.
- [7] J. C. Gittins. Bandit processes and dynamic allocation indices. Journal of the Royal Statistical Society, Series B (Methodological), 41(2):148–177, 1979.
- [8] G. V. Glass. Primary, secondary, and meta-analysis of research. Educational Researcher, 5(10):3–8, 1976.
- [9] T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich. Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft’s Bing search engine. Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pages 13–20, 2010.
- [10] J. R. Hershey and P. A. Olsen. Approximating the Kullback-Leibler divergence between Gaussian mixture models. 2007 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007), 4:IV-317–IV-320, 2007.
- [11] S. Lahaie and P. McAfee. Efficient ranking in sponsored search. WINE 2011, LNCS 7090, pages 254–265, 2011.
- [12] M. Ostrovsky and M. Schwarz. Reserve prices in internet advertising auctions: a field experiment. Stanford University Graduate School of Business Research Paper No. 2054, 2009.
- [13] F. Pin and P. Key. Stochastic variability in sponsored search auctions: observations and models. Proceedings of the 12th ACM Conference on Electronic Commerce (EC 2011), pages 61–70, 2011.
- [14] A. Slivkins. Multi-armed bandits on implicit metric spaces. Advances in Neural Information Processing Systems (NIPS 2011), 24:1602–1610, 2011.
