Language models align with human judgments on key grammatical constructions
CoRR (2024)
Abstract
Do Large Language Models (LLMs) make human-like linguistic generalizations?
Dentella et al. (2023; "DGL") prompt several LLMs ("Is the following sentence
grammatically correct in English?") to elicit grammaticality judgments of 80
English sentences, concluding that LLMs demonstrate a "yes-response bias" and a
"failure to distinguish grammatical from ungrammatical sentences". We
re-evaluate LLM performance using well-established practices and find that
DGL's data in fact provide evidence for just how well LLMs capture human
behaviors. Models not only achieve high accuracy overall, but also capture
fine-grained variation in human linguistic judgments.