Distinguishing the Knowable from the Unknowable with Language Models
CoRR (2024)
Abstract
We study the feasibility of identifying epistemic uncertainty (reflecting a
lack of knowledge), as opposed to aleatoric uncertainty (reflecting entropy in
the underlying distribution), in the outputs of large language models (LLMs)
over free-form text. In the absence of ground-truth probabilities, we explore a
setting where, in order to (approximately) disentangle a given LLM's
uncertainty, a significantly larger model stands in as a proxy for the ground
truth. We show that small linear probes trained on the embeddings of frozen,
pretrained models accurately predict when larger models will be more confident
at the token level and that probes trained on one text domain generalize to
others. Going further, we propose a fully unsupervised method that achieves
non-trivial accuracy on the same task. Taken together, we interpret these
results as evidence that LLMs naturally contain internal representations of
different types of uncertainty that could potentially be leveraged to devise
more informative indicators of model confidence in diverse practical settings.
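To make the probing setup concrete, below is a minimal sketch (not the authors' code) of the idea described in the abstract: a small linear probe is trained on frozen small-model token embeddings to predict whether a much larger model, standing in as a proxy for the ground truth, will be confident (low predictive entropy) at that token. Synthetic arrays stand in for real per-token embeddings and entropies; the dimensions, the synthetic entropy signal, and the median-entropy threshold are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a token-level confidence probe, assuming synthetic
# stand-ins for (a) frozen small-LLM token embeddings and (b) the large
# proxy model's per-token predictive entropy. All sizes are illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_tokens, d_model = 5000, 256  # assumed sizes, for illustration only
small_embeddings = rng.normal(size=(n_tokens, d_model))  # frozen small-LLM hidden states

# Stand-in for the large model's per-token predictive entropy: here it is
# a noisy linear function of the embedding so the probe has signal to find.
w_true = rng.normal(size=d_model)
large_entropy = small_embeddings @ w_true * 0.1 + rng.normal(scale=0.5, size=n_tokens)

# Binarize the target: is the large model "confident" at this token,
# i.e. is its entropy below a chosen threshold (here, the median)?
labels = (large_entropy < np.median(large_entropy)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    small_embeddings, labels, test_size=0.2, random_state=0
)

# The probe itself is just a linear classifier on the frozen embeddings.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
print(f"held-out probe accuracy: {probe.score(X_test, y_test):.3f}")
```

In the paper's actual setting, the embeddings would come from a frozen pretrained LLM and the labels from a much larger model's token-level confidence; the sketch only shows the shape of the supervised probing task, not the unsupervised variant the abstract also mentions.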