Measuring Dependence between Events
arXiv (2024)
Abstract
Measuring dependence between two events, or equivalently between two binary
random variables, amounts to expressing the dependence structure inherent in a
2×2 contingency table in a real number between -1 and 1. Countless
such dependence measures exist, but there is little theoretical guidance on how
they compare and on their advantages and shortcomings. Thus, practitioners
might be overwhelmed by the problem of choosing a suitable measure. We provide
a set of natural desirable properties that a proper dependence measure should
fulfill. We show that Yule's Q and the little-known Cole coefficient are
proper, while the most widely used measures, the phi coefficient and all
contingency coefficients, are improper. The improper measures have a severe
attainability problem: even under perfect dependence they can be very far from
-1 and 1, and they often differ substantially from the proper measures in that
they understate the strength of dependence. The structural reason is that these
are measures of equality of events rather than of dependence. We derive the (in
some instances non-standard) limiting distributions of the measures and
illustrate how asymptotically valid confidence intervals can be constructed. In
a case study on drug consumption we demonstrate how misleading conclusions may
arise from the use of improper dependence measures.
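
The attainability problem can be illustrated with a small numerical sketch. The abstract gives no formulas, so the definitions below are assumptions: the textbook phi coefficient and Yule's Q for a table with cell counts a, b, c, d, and a Cole-type coefficient written as the covariance of the two event indicators divided by the largest (or smallest) value it can take given the margins. The function names are hypothetical, and degenerate tables (an empty row or column, or ad + bc = 0) are not handled.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient of the 2x2 table [[a, b], [c, d]] (textbook formula)."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom


def yule_q(a, b, c, d):
    """Yule's Q: (ad - bc) / (ad + bc)."""
    return (a * d - b * c) / (a * d + b * c)


def cole_coefficient(a, b, c, d):
    """Cole-type coefficient (assumed form): covariance rescaled by the
    bound it can attain for the given margins.

    All quantities are scaled by n^2 so that only integers are involved
    until the final division.
    """
    n = a + b + c + d
    row1, col1 = a + b, a + c          # counts of event A and event B
    cov = a * d - b * c                # equals n^2 * (P(A and B) - P(A)P(B))
    if cov >= 0:
        # Upper bound: P(A and B) can be at most min(P(A), P(B)).
        bound = min(row1, col1) * n - row1 * col1
    else:
        # Lower bound: P(A and B) can be at least max(0, P(A) + P(B) - 1).
        bound = row1 * col1 - max(0, row1 + col1 - n) * n
    return cov / bound


# Hypothetical table in which A never occurs without B (A implies B):
#            B    not B
#   A       10      0
#   not A   40     50
table = (10, 0, 40, 50)
print("phi :", phi_coefficient(*table))
print("Q   :", yule_q(*table))
print("Cole:", cole_coefficient(*table))
```

For this made-up table, Yule's Q and the Cole-type coefficient both equal 1, whereas phi is only 1/3, because phi reaches 1 only when the two events coincide rather than when one implies the other.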