How Good Are LLMs at Out-of-Distribution Detection?
arXiv (2023)
Abstract
Out-of-distribution (OOD) detection plays a vital role in enhancing the
reliability of machine learning (ML) models. The emergence of large language
models (LLMs) has catalyzed a paradigm shift within the ML community,
showcasing their exceptional capabilities across diverse natural language
processing tasks. While existing research has probed OOD detection with
relatively small-scale Transformers like BERT, RoBERTa and GPT-2, the stark
differences in scales, pre-training objectives, and inference paradigms call
into question the applicability of these findings to LLMs. This paper embarks
on a pioneering empirical investigation of OOD detection in the domain of LLMs,
focusing on the LLaMA series ranging from 7B to 65B in size. We thoroughly evaluate
commonly used OOD detectors, scrutinizing their performance in both zero-grad
and fine-tuning scenarios. Notably, we alter previous discriminative
in-distribution fine-tuning into generative fine-tuning, aligning the
pre-training objective of LLMs with downstream tasks. Our findings unveil that
a simple cosine distance OOD detector demonstrates superior efficacy,
outperforming other OOD detectors. We provide an intriguing explanation for
this phenomenon by highlighting the isotropic nature of the embedding spaces of
LLMs, which distinctly contrasts with the anisotropic property observed in
smaller BERT family models. The new insight enhances our understanding of how
LLMs detect OOD data, thereby improving their adaptability and reliability in
dynamic environments. We have released the source code for other researchers
to reproduce our results.
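
The abstract does not spell out how the cosine distance detector is computed, so the following is only a minimal sketch of one common formulation: a test input is scored by one minus its highest cosine similarity to any in-distribution embedding, with larger scores suggesting OOD. The function name `cosine_ood_scores` and the toy 2-D vectors are illustrative assumptions, not taken from the paper; in practice the rows would be sentence embeddings extracted from the LLM.

```python
import numpy as np


def cosine_ood_scores(train_emb: np.ndarray, test_emb: np.ndarray) -> np.ndarray:
    """Return one OOD score per test row (hypothetical helper, not the paper's code).

    Score = 1 - max cosine similarity to any in-distribution (training) embedding.
    Higher scores mean the input is farther from the in-distribution data.
    """
    # L2-normalise rows so that a dot product equals cosine similarity.
    train = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    test = test_emb / np.linalg.norm(test_emb, axis=1, keepdims=True)
    sims = test @ train.T  # shape: (n_test, n_train)
    return 1.0 - sims.max(axis=1)


if __name__ == "__main__":
    # Toy stand-ins for embeddings; real ones would come from the model.
    in_dist = np.array([[1.0, 0.0], [0.9, 0.1]])
    queries = np.array([[1.0, 0.05],   # close to the in-distribution vectors
                        [-1.0, 0.2]])  # points in nearly the opposite direction
    print(cosine_ood_scores(in_dist, queries))
```

A decision threshold on these scores would have to be chosen on held-out data; the sketch leaves that step out.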