Estimation of subsidiary performance metrics under optimal policies
arXiv (2024)
Abstract
In policy learning, the goal is typically to optimize a primary performance
metric, but other subsidiary metrics often also warrant attention. This paper
presents two strategies for evaluating these subsidiary metrics under a policy
that is optimal for the primary one. The first relies on a novel margin
condition that facilitates Wald-type inference. Under this and other regularity
conditions, we show that the one-step corrected estimator is efficient. Although
useful, this margin condition places strong restrictions on how the
subsidiary metric behaves for nearly optimal policies, and these restrictions may
not hold in practice. We therefore introduce alternative, two-stage strategies that do not
require a margin condition. The first stage constructs a set of candidate
policies and the second builds a uniform confidence interval over this set. We
provide numerical simulations to evaluate the performance of these methods in
different scenarios.
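
The two-stage idea can be illustrated with a minimal sketch. This is not the paper's construction; it is a simplified version under loudly stated assumptions: the policy class is a finite set of K policies, per-observation score columns are already available whose sample means estimate each policy's primary and subsidiary metric (for example, inverse-probability-weighted outcomes), and both stages use normal approximations with Bonferroni corrections. The function name `two_stage_interval` and its arguments are hypothetical.

```python
from statistics import NormalDist

import numpy as np


def two_stage_interval(primary_scores, subsidiary_scores, alpha=0.05):
    """Illustrative two-stage interval for a subsidiary metric.

    Both inputs have shape (n, K): row i, column j holds observation i's
    score for policy j (assumed unbiased for that policy's metric).
    Returns (indices of candidate policies, lower bound, upper bound).
    """
    primary = np.asarray(primary_scores, dtype=float)
    subsidiary = np.asarray(subsidiary_scores, dtype=float)
    n, k = primary.shape

    # Assumption: split the error budget evenly between the two stages.
    alpha1 = alpha2 = alpha / 2

    # Stage 1: keep every policy whose primary metric is not significantly
    # worse than the empirical best (one-sided, Bonferroni over K - 1 tests).
    best = int(np.argmax(primary.mean(axis=0)))
    diffs = primary[:, [best]] - primary  # paired differences, shape (n, K)
    se_diff = diffs.std(axis=0, ddof=1) / np.sqrt(n)
    z1 = NormalDist().inv_cdf(1 - alpha1 / max(k - 1, 1))
    keep = diffs.mean(axis=0) <= z1 * se_diff
    keep[best] = True  # the empirical best is always a candidate

    # Stage 2: two-sided intervals for the subsidiary metric that hold
    # simultaneously over the candidate set (Bonferroni), then their union.
    m = int(keep.sum())
    z2 = NormalDist().inv_cdf(1 - alpha2 / (2 * m))
    means = subsidiary[:, keep].mean(axis=0)
    ses = subsidiary[:, keep].std(axis=0, ddof=1) / np.sqrt(n)
    lower = float((means - z2 * ses).min())
    upper = float((means + z2 * ses).max())
    return np.flatnonzero(keep), lower, upper
```

Taking the union over the candidate set is what removes the need for a margin condition in this sketch: the interval stays valid even when several policies are nearly tied on the primary metric, at the cost of being wider when their subsidiary metrics differ.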