A Sequential Test for Selecting the Better Variant - Online A/B testing, Adaptive Allocation, and Continuous Monitoring.

WSDM (2019)

Abstract
Online A/B tests play an instrumental role in helping Internet companies improve products and technologies in a data-driven manner. An online A/B test, in its most straightforward form, can be treated as a static hypothesis test where traditional statistical tools such as p-values and power analysis might be applied to help decision makers determine which variant performs better. However, a static A/B test presents both a time cost and an opportunity cost for rapid product iterations. For time cost, fast-paced product evolution pushes its shareholders to consistently monitor results from online A/B experiments, which usually invites peeking and altering experimental designs as data are collected. It is recognized that this flexibility might harm statistical guarantees if not introduced in the right way, especially when online tests are considered as static hypothesis tests. For opportunity cost, a static test usually entails a static allocation of users into different variants, which prevents an immediate roll-out of the better version to a larger audience or risks alienating users who may suffer from a bad experience. While some works try to tackle these challenges, no prior method offers a holistic solution to both issues. In this paper, we propose a unified framework utilizing sequential analysis and multi-armed bandits to address the time cost and the opportunity cost of static online tests simultaneously. In particular, we present an imputed sequential Girshick test that accommodates online data and dynamic allocation of data. The unobserved potential outcomes are treated as missing data and are imputed using empirical averages. Focusing on the binomial model, we demonstrate through extensive experiments that the proposed imputed Girshick test achieves Type-I error and power control under both a fixed allocation ratio and an adaptive allocation such as Thompson Sampling. In addition, we run experiments on historical Etsy.com A/B tests to show the reduction in opportunity cost when using the proposed method.
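
To make the adaptive-allocation idea concrete, the following is a minimal sketch of Thompson Sampling for two variants under a binomial (Bernoulli) outcome model. It is not the paper's imputed Girshick test and does not perform any sequential stopping; it only illustrates how traffic can drift toward the better variant. The function name `thompson_allocate`, the uniform Beta(1, 1) priors, the conversion rates, and the seed are all assumptions chosen for illustration.

```python
import random


def thompson_allocate(true_rates, n_users, seed=0):
    """Simulate Beta-Bernoulli Thompson Sampling (illustrative sketch).

    true_rates: hypothetical conversion rate per variant (unknown in practice).
    Returns per-variant success and failure counts after n_users assignments.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_users):
        # Draw one sample from each variant's Beta posterior (Beta(1, 1) prior).
        draws = [
            rng.betavariate(1 + successes[a], 1 + failures[a]) for a in range(k)
        ]
        # Assign the user to the variant with the largest posterior draw.
        arm = max(range(k), key=lambda a: draws[a])
        # Simulate the binary outcome for the chosen variant.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


# Hypothetical rates: variant 1 converts far more often than variant 0.
successes, failures = thompson_allocate([0.10, 0.50], n_users=2000)
pulls = [s + f for s, f in zip(successes, failures)]
print("users per variant:", pulls)
```

In this simulation most users end up assigned to the higher-converting variant, which is the opportunity-cost reduction the abstract refers to. Because the allocation depends on past outcomes, a standard fixed-sample test is no longer directly applicable, which is the gap the proposed sequential test is meant to address.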
Keywords
controlled experiments, imputed sequential analysis, online A/B tests, sequential analysis, Thompson sampling