Towards objectively evaluating the quality of generated medical summaries

Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval), 2021

Abstract
We propose a method for evaluating the quality of generated text by asking evaluators to count facts, and computing precision, recall, F-score, and accuracy from the raw counts. We believe this approach leads to a more objective and more easily reproducible evaluation. We apply this to the task of medical report summarisation, where measuring objective quality and accuracy is of paramount importance.
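As a rough illustration of how metrics can be derived from raw fact counts, the following minimal sketch computes precision, recall, and F-score. It is not taken from the paper: the count categories (`correct`, `incorrect`, `omitted`), the `FactCounts` class, and the function names are all assumptions made for this example. Accuracy is left out because the abstract does not say which counts it is computed from.

```python
from dataclasses import dataclass


@dataclass
class FactCounts:
    """Hypothetical per-summary counts supplied by a human evaluator."""
    correct: int    # facts in the generated summary that the reference supports
    incorrect: int  # facts in the generated summary that the reference does not support
    omitted: int    # reference facts missing from the generated summary


def precision(c: FactCounts) -> float:
    # Share of generated facts that are correct.
    denom = c.correct + c.incorrect
    return c.correct / denom if denom else 0.0


def recall(c: FactCounts) -> float:
    # Share of reference facts that were reproduced.
    denom = c.correct + c.omitted
    return c.correct / denom if denom else 0.0


def f_score(c: FactCounts, beta: float = 1.0) -> float:
    # Weighted harmonic mean of precision and recall (F1 when beta == 1).
    p, r = precision(c), recall(c)
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


if __name__ == "__main__":
    counts = FactCounts(correct=6, incorrect=2, omitted=4)
    print(f"precision={precision(counts):.3f}")
    print(f"recall={recall(counts):.3f}")
    print(f"f1={f_score(counts):.3f}")
```

Under these assumptions, a summary with 6 correct, 2 incorrect, and 4 omitted facts has a precision of 0.75 and a recall of 0.6. Because the evaluator only supplies counts, two evaluators can be compared directly on the numbers they report.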
Keywords
medical summaries, quality