STAR: A general interactive framework for FDR control under structural constraints

arXiv: Methodology (2020)

Abstract
We propose a general framework based on \textit{selectively traversed accumulation rules} (STAR) for interactive human-in-the-loop multiple testing with generic structural constraints on the rejection set. STAR combines accumulation tests from ordered multiple testing with data-carving ideas from post-selection inference, allowing for highly flexible adaptation to generic structural information. Given independent $p$-values for each of $n$ null hypotheses, STAR defines an iterative protocol for gradually pruning a candidate rejection set, beginning with $\mathcal{R}_0 = [n]$ and shrinking with each step. At step $t$, the analyst estimates the false discovery proportion (FDP) of the current rejection set $\mathcal{R}_t$, and halts and rejects every $H_i$ with $i \in \mathcal{R}_t$ if $\mathrm{FDP}_t \leq \alpha$. Otherwise, the analyst may shrink the rejection set to $\mathcal{R}_{t+1} \subseteq \mathcal{R}_t$ however she wants, provided the choice depends only on partially masked $p$-values $g(p_i)$ for $i \in \mathcal{R}_t$, as well as unmasked $p$-values $p_i$ for $i \notin \mathcal{R}_t$. Typically, the choice will be based on eliminating the least promising hypothesis from $\mathcal{R}_t$, after estimating a model from the observable data. By restricting the information available to the analyst, our iterative protocol guarantees exact false discovery rate (FDR) control at level $\alpha$ in finite samples, for any data-adaptive update rule the analyst may choose. We suggest heuristic update rules for a variety of applications with complex structural constraints, show that STAR performs well for problems ranging from convex region detection and bump-hunting to FDR control on trees and DAGs, and show how to extend STAR to regression problems where knockoff statistics are available in lieu of $p$-values.
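
The pruning loop described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration, since the abstract does not fix these choices: the accumulation function `h(p) = 2 * 1{p > 0.5}`, the masking function `g(p) = min(p, 1 - p)`, the estimator `(2 + sum of h(p_i) over R_t) / (1 + |R_t|)`, and the hypothetical names `star_sketch` and `drop_largest_masked`. The sketch also ignores structural constraints entirely and simply drops the hypothesis with the largest masked $p$-value.

```python
def mask(p):
    # Assumed masking function g: hides whether p lies below or above 0.5.
    return min(p, 1.0 - p)


def h(p):
    # Assumed accumulation function (SeqStep-style with threshold 0.5).
    return 2.0 if p > 0.5 else 0.0


def fdp_estimate(pvals, candidate):
    # Assumed estimator form: (2 + sum of h over R_t) / (1 + |R_t|).
    # It is computed by the protocol from the true p-values; the analyst's
    # update rule never receives those values for hypotheses still in R_t.
    return (2.0 + sum(h(pvals[i]) for i in candidate)) / (1.0 + len(candidate))


def star_sketch(pvals, alpha, choose_to_drop):
    """Prune R_t from [n] until the FDP estimate is at most alpha."""
    candidate = set(range(len(pvals)))  # R_0 = [n]
    while candidate:
        if fdp_estimate(pvals, candidate) <= alpha:
            return sorted(candidate)  # halt and reject every H_i with i in R_t
        # Information available to the analyst at this step:
        masked = {i: mask(pvals[i]) for i in candidate}       # g(p_i), i in R_t
        revealed = {i: p for i, p in enumerate(pvals)
                    if i not in candidate}                    # p_i, i not in R_t
        drop = choose_to_drop(masked, revealed)
        if drop not in candidate:
            raise ValueError("update rule must remove an element of R_t")
        candidate.remove(drop)  # R_{t+1} is a subset of R_t
    return []  # nothing could be rejected at this level


def drop_largest_masked(masked, revealed):
    # Hypothetical heuristic: treat the largest masked p-value as least promising.
    return max(masked, key=masked.get)


if __name__ == "__main__":
    # Ten small p-values followed by four unpromising ones.
    pvals = [0.001 * (i + 1) for i in range(10)] + [0.45, 0.55, 0.48, 0.6]
    print(star_sketch(pvals, alpha=0.2, choose_to_drop=drop_largest_masked))
```

The update rule is passed in as a function so that any data-adaptive choice can be swapped in, as long as it reads only the `masked` and `revealed` dictionaries. A structured application would replace `drop_largest_masked` with a rule that also keeps the candidate set within the allowed shapes (for example, a connected region or a subtree).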