Benchmarks for Deep Off-Policy Evaluation
International Conference on Learning Representations, 2021.
A benchmark proposal for off-policy evaluation and policy selection.
Off-policy evaluation (OPE) holds the promise of being able to leverage large, offline datasets for both obtaining and selecting complex policies for decision making. The ability to perform evaluation offline is particularly important in many real-world domains, such as healthcare, recommender systems, or robotics, where online data colle...