Masked language models directly encode linguistic uncertainty

SCIL (2022)

Abstract
Recent advances in human language processing research have suggested that large language models (LLMs), by virtue of their predictive power, can serve as cognitive models of human language processing. Evidence for this comes from LLMs' close fit to human psychophysical data, such as reaction times or brain responses in language comprehension experiments. Those adopting LLM architectures as models of human language processing frame the problem of language comprehension as prediction of the next linguistic event (Goodkind and Bicknell, 2018; Eisape et al., 2020), focusing in particular on lexical or syntactic surprisal. However, this approach fails to consider that comprehenders make predictions using some representation of the content of an utterance. That is, readers maintain a mental model that yields an evolving understanding of who is doing what to whom and how. Surprisal measures, by contrast, make no predictions about content: surprisal simply reflects the conditional probability of a linguistic event given its surrounding context. Many convergent cues in the upstream context, such as the frequencies of the words in the sentence so far, affect the models' hidden-state representations, which in turn may influence the predictability of upcoming words. The present work departs from the surprisal paradigm by assessing how much the hidden-state representations of LLMs, which are the source of the predictive power that LLMs have over symbolic representations, encode uncertainty relevant to human language processing. We assess this possibility using the stimulus set from Federmeier et al. (2007), which contains sentences whose final-word predictability was manipulated by making the sentence frames either strongly or weakly constraining. We therefore test whether constraint can be predicted directly from sentence embeddings, in order to better understand whether and how linguistic uncertainty is encoded in hidden states.
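
For context, surprisal is the negative log conditional probability of a word given its context, −log p(w | context), whereas the abstract's proposal is to probe the hidden states themselves. The sketch below illustrates one way such a probe could look: it extracts sentence embeddings from a masked language model and fits a linear classifier to predict constraint. This is a minimal sketch under stated assumptions, not the paper's actual method or materials: the checkpoint (bert-base-uncased), mean pooling over token states, the toy sentence frames, and the logistic-regression probe are all illustrative choices.

```python
# Hypothetical sketch: can sentence constraint (strongly vs. weakly constraining)
# be decoded from masked-LM sentence embeddings? The sentences below are toy
# stand-ins, NOT the Federmeier et al. (2007) stimuli.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "bert-base-uncased"  # assumption: any masked-LM checkpoint could be used
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

# Toy sentence frames with the final word removed;
# labels: 1 = strongly constraining, 0 = weakly constraining.
sentences = [
    "He mailed the letter without a",          # strong
    "She opened the drawer and took out a",    # weak
    "The pianist sat down and began to",       # strong
    "They were surprised by what was in the",  # weak
]
labels = [1, 0, 1, 0]

def embed(sentence: str) -> torch.Tensor:
    """Return one sentence embedding: mean of the token hidden states."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, hidden_dim)
    return hidden.mean(dim=1).squeeze(0)            # mean pooling over tokens

X = torch.stack([embed(s) for s in sentences]).numpy()
y = labels

# Linear probe: if constraint is linearly decodable from the embeddings,
# held-out accuracy on a full stimulus set should exceed chance.
probe = LogisticRegression(max_iter=1000)
probe.fit(X, y)
print("training accuracy on toy data:", probe.score(X, y))
```

With a real stimulus set one would evaluate the probe with cross-validation rather than training accuracy; the toy data here are only meant to show the shape of the pipeline.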