Whose LLM is it Anyway? Linguistic Comparison and LLM Attribution for GPT-3.5, GPT-4 and Bard
CoRR (2024)
Abstract
Large Language Models (LLMs) are capable of generating text that matches or
surpasses human quality. However, it is unclear whether LLMs tend to
exhibit distinctive linguistic styles, as human authors do. Through a
comprehensive linguistic analysis, we compare the vocabulary, Part-Of-Speech
(POS) distribution, dependency distribution, and sentiment of texts generated
by three of the most popular LLMs today (GPT-3.5, GPT-4, and Bard) in response
to diverse inputs. The results point to significant linguistic variations which, in turn,
enable us to attribute a given text to its LLM origin with a favorable 88%
accuracy using a simple off-the-shelf classification model. Theoretical and
practical implications of this intriguing finding are discussed.
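
As a rough illustration of what attribution from linguistic features can look like, the sketch below turns a text into a small stylometric vector and assigns it to the nearest class centroid. This is not the paper's pipeline: the abstract does not name its features or classifier in detail, so the function-word list, the mean-word-length feature, the nearest-centroid rule, and all function names (`features`, `train_centroids`, `attribute`) are assumptions chosen to keep the example dependency-free.

```python
from collections import Counter
import math

# Hypothetical feature set: a handful of common English function words.
# The paper's actual features (vocabulary, POS, dependency, sentiment) are richer.
FUNCTION_WORDS = ["the", "a", "of", "and", "to", "in", "is", "that", "it", "however"]


def features(text):
    """Return relative function-word frequencies plus a scaled mean word length."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    counts = Counter(words)
    total = max(len(words), 1)  # avoid division by zero on empty input
    vec = [counts[w] / total for w in FUNCTION_WORDS]
    vec.append(sum(len(w) for w in words) / total / 10.0)
    return vec


def train_centroids(labeled_texts):
    """Average the feature vectors of each label's texts into one centroid."""
    sums = {}
    n = Counter()
    for label, text in labeled_texts:
        vec = features(text)
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        n[label] += 1
    return {label: [s / n[label] for s in vec] for label, vec in sums.items()}


def attribute(text, centroids):
    """Assign the text to the label whose centroid is closest (Euclidean)."""
    vec = features(text)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))
```

In use, `train_centroids` would be given `(label, text)` pairs such as `("GPT-4", "...")` collected from each model, and `attribute` would then return the best-matching label for an unseen text. A real replication would likely swap in POS and dependency features from a parser and a stronger off-the-shelf classifier.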