Towards objectively evaluating the quality of generated medical summaries
Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval), 2021
Abstract
We propose a method for evaluating the quality of generated text in which evaluators count facts, and precision, recall, F-score, and accuracy are computed from the raw counts. We believe this approach leads to a more objective and more easily reproducible evaluation. We apply it to the task of medical report summarisation, where measuring quality and accuracy objectively is of paramount importance.
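The step from raw fact counts to scores can be illustrated with a minimal sketch. The abstract does not say which counts evaluators record, so the three categories below (correct, incorrect, omitted) and the names `FactCounts` and `fact_scores` are assumptions made for illustration, not the paper's protocol. The formulas are the standard definitions of precision, recall, and F1.

```python
from dataclasses import dataclass


@dataclass
class FactCounts:
    # Hypothetical count categories; the abstract does not specify them.
    correct: int    # facts in the generated summary judged correct
    incorrect: int  # facts in the generated summary judged incorrect
    omitted: int    # reference facts missing from the generated summary


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when there is nothing to divide by.
    return numerator / denominator if denominator else 0.0


def fact_scores(counts: FactCounts) -> dict:
    # Precision: share of generated facts that are correct.
    precision = _ratio(counts.correct, counts.correct + counts.incorrect)
    # Recall: share of reference facts that the summary reproduces.
    recall = _ratio(counts.correct, counts.correct + counts.omitted)
    # F1: harmonic mean of precision and recall.
    f_score = _ratio(2 * precision * recall, precision + recall)
    return {"precision": precision, "recall": recall, "f_score": f_score}


if __name__ == "__main__":
    # Example: 8 correct facts, 2 incorrect facts, 2 omitted facts.
    print(fact_scores(FactCounts(correct=8, incorrect=2, omitted=2)))
```

Accuracy is left out of the sketch because the abstract does not state how it is derived from the counts.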
Keywords
medical summaries, quality