Token-Ensemble Text Generation: On Attacking the Automatic AI-Generated Text Detection
CoRR (2024)
Abstract
The robustness of AI-content detection models against deliberately crafted
attacks (e.g., paraphrasing or word substitution) remains a significant
concern. This study proposes a novel token-ensemble generation strategy to
challenge the robustness of current AI-content detection approaches. We explore
this ensemble attack by completing the prompt one token at a time, where each
next token is generated by an LLM chosen at random from a pool of candidate
LLMs. We find that the token-ensemble approach significantly degrades the
performance of AI-content detection models. (The code and test sets will be
released.) Our findings reveal that token-ensemble generation poses a serious
challenge to current detection models and underline the need to advance
detection technologies to counter sophisticated adversarial strategies.
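
The token-ensemble idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `NextTokenFn` interface, the `make_toy_model` stand-ins, and the `token_ensemble_generate` function are all hypothetical names introduced here for illustration. The sketch also assumes every candidate model consumes and produces the same string tokens, whereas real LLMs typically use different tokenizers and would need the running text to be re-encoded for each model.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical interface: a "model" maps the tokens so far to one next token.
NextTokenFn = Callable[[Sequence[str]], str]


def make_toy_model(vocab: Sequence[str]) -> NextTokenFn:
    """Toy stand-in for an LLM: picks a token from its own vocabulary
    based on the current context length. Purely illustrative."""
    def next_token(context: Sequence[str]) -> str:
        return vocab[len(context) % len(vocab)]
    return next_token


def token_ensemble_generate(
    prompt: Sequence[str],
    models: Sequence[NextTokenFn],
    max_new_tokens: int,
    rng: random.Random,
    eos: str = "<eos>",
) -> List[str]:
    """Extend the prompt one token at a time; at every step a randomly
    chosen candidate model produces the next token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        model = rng.choice(models)   # random candidate for this step
        token = model(tokens)        # that model sees the full shared context
        if token == eos:             # assumed end-of-sequence marker
            break
        tokens.append(token)
    return tokens


# Two toy "LLMs" with disjoint vocabularies, so the mixing is visible.
model_a = make_toy_model(["alpha", "beta", "gamma"])
model_b = make_toy_model(["one", "two", "three"])

output = token_ensemble_generate(
    prompt=["The", "result", "is"],
    models=[model_a, model_b],
    max_new_tokens=8,
    rng=random.Random(0),
)
print(" ".join(output))
```

Because the generating model changes from step to step, the resulting text does not follow the token distribution of any single LLM, which is plausibly what makes it harder for a detector tuned to individual models.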