Detecting Artificially Generated Academic Text: The Importance of Mimicking Human Utilization of Large Language Models.

NLDB (2023)

Abstract
The advent of Large Language Models (LLMs) has led to a surge in Natural Language Generation (NLG), aiding humans in composing text for various tasks. However, there is a risk of these models being misused; for instance, distinguishing artificially generated text from original text is a concern in academia. Current research on detection does not attempt to replicate how humans would actually use these models. In our work, we address this issue by leveraging data generated by mimicking how humans would use LLMs when composing academic works. Our study examines the detectability of the generated text using DetectGPT and GLTR, and we utilize state-of-the-art classification models such as SciBERT, RoBERTa, DeBERTa, XLNet, and ELECTRA. Our experiments show that the generated text is difficult to detect with existing models when it is created using an LLM fine-tuned on the remainder of a paper. This highlights the importance of using realistic and challenging datasets in future research aimed at detecting artificially generated text.
Keywords
academic text, mimicking human utilization, models
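The abstract cites GLTR, a detection approach that inspects where each observed token falls in a language model's ranked next-token distribution, since machine-generated text tends to draw heavily from the head of that distribution. The snippet below is a minimal sketch of that idea using GPT-2 via Hugging Face transformers; it is not the authors' pipeline, and the choice of scoring model, the top-10 threshold, and the example sentence are illustrative assumptions only.

```python
# Minimal GLTR-style detectability check (illustrative sketch, not the paper's code):
# for each token in a passage, compute its rank under GPT-2's next-token
# distribution. A large fraction of low-rank (e.g. top-10) tokens is one
# signal commonly associated with machine-generated text.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def token_ranks(text: str) -> list[int]:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits  # shape: (1, seq_len, vocab_size)
    ranks = []
    # Token i is predicted from the logits at position i - 1.
    for i in range(1, ids.size(1)):
        scores = logits[0, i - 1]
        # Rank = 1 + number of vocabulary items scored higher than the observed token.
        rank = int((scores > scores[ids[0, i]]).sum().item()) + 1
        ranks.append(rank)
    return ranks

text = "Large language models can draft sections of academic papers."
ranks = token_ranks(text)
frac_top10 = sum(r <= 10 for r in ranks) / len(ranks)
print(f"Fraction of tokens in GPT-2's top-10: {frac_top10:.2f}")
```

The paper's central observation is that this kind of signal weakens when the generator is fine-tuned on the remainder of the paper being completed, which is why the authors argue for evaluating detectors on such realistically constructed data.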