An Entropy-based Text Watermarking Detection Method
arXiv (2024)
Abstract
Text watermarking algorithms for large language models (LLMs) can
embed hidden features into texts generated by LLMs to facilitate subsequent
detection, thus alleviating the problem of misuse of LLMs. Although current
text watermarking algorithms perform well in most high-entropy scenarios, their
performance in low-entropy scenarios still needs to be improved. In this work,
we propose that the influence of token entropy should be fully considered in
the watermark detection process: the weight of each token should be
adjusted according to its entropy during watermark detection, rather than
set to the same value for all tokens as in previous methods.
Specifically, we propose Entropy-based Watermark Detection (EWD), which gives
higher-entropy tokens higher weights during watermark detection, so as to
better reflect the degree of watermarking. Furthermore, the proposed detection
process is training-free and fully automated. The method can also use a
proxy-LLM to calculate the entropy of each token, without the need to use the
original LLM. In experiments, we find that our method achieves better
detection performance in low-entropy scenarios, and that it is general
enough to be applied to texts with different entropy distributions. Our code and
data will be available online.
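
The idea of weighting tokens by entropy during detection can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a green-list style watermark in which each token is either "green" (watermark-consistent) or not, a green-list ratio gamma, Shannon entropy as the entropy measure, and a weight equal to the entropy itself. The function names, the weight function, and the exact test statistic are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution.

    Assumption: Shannon entropy is used here only as a stand-in; the
    abstract does not say which entropy measure the method uses.
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def weighted_z_score(green_flags, weights, gamma=0.5):
    """Weighted one-sided z-statistic for green-token counts.

    Assumption: under the no-watermark hypothesis each token is green
    independently with probability gamma, so the weighted green count has
    mean gamma * sum(w) and variance gamma * (1 - gamma) * sum(w^2).
    """
    total = sum(weights)
    sum_sq = sum(w * w for w in weights)
    if sum_sq == 0.0:
        return 0.0  # no token carries any weight, so there is no evidence
    observed = sum(w for w, g in zip(weights, green_flags) if g)
    return (observed - gamma * total) / math.sqrt(gamma * (1.0 - gamma) * sum_sq)


def uniform_z_score(green_flags, gamma=0.5):
    """Baseline in which every token has the same weight."""
    return weighted_z_score(green_flags, [1.0] * len(green_flags), gamma)


# Toy example: two near-deterministic (low-entropy) positions where the
# watermark could not steer the choice, and two high-entropy positions
# where the sampled token landed in the green list.
distributions = [
    [1.0, 0.0, 0.0, 0.0],      # low entropy
    [1.0, 0.0, 0.0, 0.0],      # low entropy
    [0.25, 0.25, 0.25, 0.25],  # high entropy
    [0.25, 0.25, 0.25, 0.25],  # high entropy
]
green = [False, False, True, True]
entropy_weights = [shannon_entropy(d) for d in distributions]

z_weighted = weighted_z_score(green, entropy_weights)
z_uniform = uniform_z_score(green)
```

In this toy case the uniform score treats the two low-entropy, non-green tokens as evidence against a watermark, while the entropy-weighted score discounts them, so `z_weighted` comes out larger than `z_uniform`. In practice the per-token distributions would come from a language model (possibly a proxy model, as the abstract suggests) rather than being written by hand.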