UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation
EMNLP 2020, pp. 9157-9166, 2020.
Extensive experiments show that UNION outperforms state-of-the-art metrics in terms of correlation with human judgments on two story datasets, and is more robust to dataset drift and quality drift.
Despite the success of existing referenced metrics (e.g., BLEU and MoverScore), they correlate poorly with human judgments for open-ended text generation, including story or dialog generation, because of the notorious one-to-many issue: there are many plausible outputs for the same input, which may differ substantially in literal or semanti...