Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models
CoRR (2024)
Abstract
Large language models generate high-quality responses that may contain
misinformation, underscoring the need for regulation by distinguishing
AI-generated text from human-written text. Watermarking is pivotal in this
context: it embeds hidden markers, imperceptible to humans, in text during the
LLM inference phase. Current watermarking algorithms, however, struggle to
achieve both detectability of the inserted watermarks and semantic integrity
of the generated text, since enhancing one aspect often undermines the other.
To overcome this, we introduce a novel
multi-objective optimization (MOO) approach for watermarking that utilizes
lightweight networks to generate token-specific watermarking logits and
splitting ratios. By leveraging MOO to optimize for both detection and semantic
objective functions, our method simultaneously achieves detectability and
semantic integrity. Experimental results show that our method outperforms
current watermarking techniques in enhancing the detectability of texts
generated by LLMs while maintaining their semantic coherence. Our code is
available at https://github.com/mignonjia/TS_watermark .
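
To make the idea of token-specific watermarking logits and splitting ratios concrete, the following is a minimal sketch and not the authors' implementation. It assumes a green-list style watermark in which the previous token determines both the fraction of the vocabulary placed on the green list (the splitting ratio, here `gamma`) and the bias added to green-token logits (the watermark logit, here `delta`). The function `toy_gamma_delta` is a hypothetical stand-in for the lightweight networks, the random logits stand in for an LLM, and the z-score in `z_score` is an assumed generalization of the usual green-token count test to per-token ratios. Every name and constant is illustrative.

```python
import hashlib
import math
import random

VOCAB_SIZE = 50  # toy vocabulary size (assumption)


def toy_gamma_delta(prev_token):
    """Hypothetical stand-in for the lightweight networks.

    Maps the previous token to a splitting ratio gamma in [0.25, 0.75]
    and a watermark logit delta in [1.0, 3.0] via a hash.
    """
    digest = hashlib.sha256(f"params-{prev_token}".encode()).digest()
    gamma = 0.25 + 0.5 * digest[0] / 255
    delta = 1.0 + 2.0 * digest[1] / 255
    return gamma, delta


def green_list(prev_token, gamma):
    """Pseudo-randomly pick a gamma fraction of the vocabulary, seeded by prev_token."""
    seed_bytes = hashlib.sha256(f"green-{prev_token}".encode()).digest()[:8]
    rng = random.Random(int.from_bytes(seed_bytes, "big"))
    perm = list(range(VOCAB_SIZE))
    rng.shuffle(perm)
    return set(perm[: int(gamma * VOCAB_SIZE)])


def watermark_logits(logits, prev_token):
    """Add the token-specific delta to the logits of green-list tokens."""
    gamma, delta = toy_gamma_delta(prev_token)
    green = green_list(prev_token, gamma)
    return [x + delta if i in green else x for i, x in enumerate(logits)]


def sample(logits, rng):
    """Sample a token index from softmax(logits)."""
    top = max(logits)
    weights = [math.exp(x - top) for x in logits]
    return rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0]


def generate(length, rng, watermark):
    """Generate tokens from random toy logits, optionally watermarked."""
    tokens = [0]
    for _ in range(length):
        logits = [rng.gauss(0.0, 1.0) for _ in range(VOCAB_SIZE)]  # toy "LLM"
        if watermark:
            logits = watermark_logits(logits, tokens[-1])
        tokens.append(sample(logits, rng))
    return tokens


def z_score(tokens):
    """Green-token z-score with a per-token splitting ratio (assumed form)."""
    hits, mean, var = 0, 0.0, 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        gamma, _ = toy_gamma_delta(prev)
        green = green_list(prev, gamma)
        g = len(green) / VOCAB_SIZE  # effective ratio after rounding
        hits += cur in green
        mean += g
        var += g * (1.0 - g)
    return (hits - mean) / math.sqrt(var)
```

A usage sketch under the same assumptions: calling `z_score(generate(400, random.Random(0), watermark=True))` and comparing it with the same call using `watermark=False` should show a clearly larger score for the watermarked sequence. In the method described above, the mapping from token to `gamma` and `delta` would be learned by optimizing detection and semantic objectives jointly rather than fixed by a hash.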