Provably Secure Disambiguating Neural Linguistic Steganography
arXiv (2024)
Abstract
Recent research in provably secure neural linguistic steganography has
overlooked a crucial aspect: the sender must detokenize stegotexts to avoid
raising suspicion from the eavesdropper. The segmentation ambiguity problem,
which arises when using language models based on subwords, leads to occasional
decoding failures in all neural language steganography implementations based on
these models. Current solutions to this issue involve altering the probability
distribution of candidate words, rendering them incompatible with provably
secure steganography. We propose a novel secure disambiguation method named
SyncPool, which effectively addresses the segmentation ambiguity problem. We
group all tokens with prefix relationships in the candidate pool before the
steganographic embedding algorithm runs to eliminate uncertainty among
ambiguous tokens. To enable the receiver to synchronize the sampling process of
the sender, a shared cryptographically secure pseudorandom number generator
(CSPRNG) is deployed to select a token from the ambiguity pool. SyncPool does
not change the size of the candidate pool or the distribution of tokens and
thus is applicable to provably secure language steganography methods. We
provide theoretical proofs and experimentally demonstrate the applicability of
our solution to various languages and models, showing its potential to
significantly improve the reliability and security of neural linguistic
steganography systems.
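
The grouping and synchronized selection described above can be illustrated with a minimal sketch. This is not the authors' implementation: the prefix-grouping rule, the HMAC-SHA256 counter construction standing in for the shared CSPRNG, the toy next-token distribution, and every function name below are assumptions made for illustration only. The steganographic embedding step, which would choose among pools, is not shown.

```python
import hashlib
import hmac


def group_by_prefix(probs):
    """Partition candidate tokens into pools of tokens related by prefix."""
    pools = []
    # In sorted order, every token that starts with a given token follows it
    # contiguously, so one pass is enough.
    for tok in sorted(probs):
        if pools and tok.startswith(pools[-1][0]):
            pools[-1].append(tok)   # extends the pool's shortest token
        else:
            pools.append([tok])     # starts a new pool
    return pools


def keyed_uniform(key, counter):
    """Deterministic value in [0, 1) from HMAC-SHA256 (stand-in for a CSPRNG)."""
    digest = hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def pick_in_pool(pool, probs, key, counter):
    """Select one token from a pool, weighted by its original probability."""
    total = sum(probs[t] for t in pool)
    threshold = keyed_uniform(key, counter) * total
    acc = 0.0
    for tok in pool:
        acc += probs[tok]
        if threshold < acc:
            return tok
    return pool[-1]  # guard against floating-point rounding


def pool_for_text(pools, text):
    """Receiver side: find the pool whose shortest token prefixes the text."""
    return next(p for p in pools if text.startswith(p[0]))


# Hypothetical next-token distribution for a single generation step.
probs = {"any": 0.3, "anyone": 0.2, "some": 0.4, "the": 0.1}
pools = group_by_prefix(probs)
key = b"shared-secret-key"  # hypothetical key agreed on in advance

# Sender: the embedding algorithm would choose a pool (here simply the first),
# then the shared generator picks the concrete token inside it.
sent = pick_in_pool(pools[0], probs, key, counter=0)

# Receiver: sees only detokenized text, locates the pool, replays the draw.
received_text = sent + " else"
pool = pool_for_text(pools, received_text)
recovered = pick_in_pool(pool, probs, key, counter=0)
```

In this sketch the receiver never has to decide whether the text begins with "any" or "anyone": it only needs to identify the pool, and the keyed draw reproduces the sender's choice. Because each token is picked within its pool in proportion to its original probability, the probability of a pool times the conditional probability of the token equals the token's original probability, which matches the abstract's claim that the distribution is unchanged.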