BiasBuster: a Neural Approach for Accurate Estimation of Population Statistics using Biased Location Data
CoRR (2024)
Abstract
While extremely useful (e.g., for COVID-19 forecasting and policy-making,
urban mobility analysis and marketing, and obtaining business insights),
location data collected from mobile devices often come from a biased
population subset, with some communities over- or underrepresented in the
collected datasets. As a result, aggregate statistics calculated from such
datasets while ignoring the bias (as is done by various companies including
Safegraph, Google, and Facebook) lead to an inaccurate representation of
population statistics. Such statistics will not only be generally inaccurate,
but the error will disproportionately impact different population subgroups
(e.g., because they ignore the underrepresented communities). This has dire
consequences, as these datasets are used for sensitive decision-making such as
COVID-19 policymaking. This paper tackles the problem of providing accurate
population statistics using such biased datasets. We show that statistical
debiasing, although in some cases useful, often fails to improve accuracy. We
then propose BiasBuster, a neural network approach that utilizes the
correlations between population statistics and location characteristics to
provide accurate estimates of population statistics. Extensive experiments on
real-world data show that BiasBuster improves accuracy by up to 2 times in
general and up to 3 times for underrepresented populations.
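
To make the bias problem concrete, the following is a minimal sketch of one common form of statistical debiasing, post-stratification, in which each group's sample mean is reweighted by its known population share. It is an illustration only and is not BiasBuster or necessarily the debiasing baseline evaluated in the paper. The function names, the group labels "A" and "B", and the toy numbers are all hypothetical.

```python
def naive_mean(samples):
    """Unweighted mean over (group, value) observations; ignores sampling bias."""
    return sum(value for _, value in samples) / len(samples)


def poststratified_mean(samples, population_share):
    """Weight each group's sample mean by its (assumed known) population share."""
    sums, counts = {}, {}
    for group, value in samples:
        sums[group] = sums.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1
    # A group with no observations cannot be reweighted at all.
    missing = set(population_share) - set(counts)
    if missing:
        raise ValueError(f"no samples for groups: {sorted(missing)}")
    return sum(
        share * sums[group] / counts[group]
        for group, share in population_share.items()
    )


# Hypothetical toy data: group A is overrepresented (8 of 10 samples),
# although both groups make up half of the true population.
samples = [("A", 10.0)] * 8 + [("B", 20.0)] * 2
population_share = {"A": 0.5, "B": 0.5}

naive = naive_mean(samples)
adjusted = poststratified_mean(samples, population_share)
print(naive, adjusted)
```

In this toy case the naive estimate is pulled toward the overrepresented group, while the reweighted estimate is not. The adjustment still relies on the few samples available for the underrepresented group, so its estimate for that group can be noisy, which is in line with the abstract's observation that statistical debiasing often fails to improve accuracy.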