Information Hiding Through Errors: A Confusing Approach

PROCEEDINGS OF THE SOCIETY OF PHOTO-OPTICAL INSTRUMENTATION ENGINEERS (SPIE) (2007)

Abstract
A substantial portion of the text available online is of a kind that tends to contain many typos and ungrammatical abbreviations, e.g., emails, blogs, forums. It is therefore not surprising that, in such texts, one can carry out information hiding by the judicious injection of typos (broadly construed to include abbreviations and acronyms). What is surprising is that, as this paper demonstrates, this form of embedding can be made quite resilient. The resilience is achieved through the use of computationally asymmetric transformations (CAT for short): transformations that can be carried out inexpensively, yet whose reversal requires much more extensive semantic analysis (easy for humans to carry out, but hard to automate). One example of a CAT is the introduction of typos that are ambiguous in that they have many possible corrections, which makes them harder to restore automatically to their original form: when considering alternative typos, we prefer ones that are also close to other vocabulary words. Such encodings do not materially degrade the text's meaning because, compared to machines, humans are very good at disambiguation. We use typo confusion matrices and word-level ambiguity to carry out this kind of encoding. Compared with robust synonym substitution, which also cleverly used ambiguity, the task here is harder because typos are very conspicuous and an obvious target for the adversary (synonyms are stealthy, typos are not). Our resilience does not depend on preventing the adversary from correcting without damage: it only depends on a multiplicity of alternative corrections. In fact, even an adversary who has boldly "corrected" all the typos by randomly choosing from the ambiguous alternatives has, on average, destroyed around w/4 of our w-bit mark (and incurred a high cost in terms of the damage done to the meaning of the text).
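The preference for typos that are "close to other vocabulary words" can be illustrated with a minimal sketch. It is not the paper's method: the abstract mentions typo confusion matrices, whereas this sketch substitutes a plain edit-distance-1 neighborhood as the closeness measure. The names `edits1`, `ambiguity`, `best_typo`, and the toy vocabulary `VOCAB` are hypothetical and chosen only for illustration.

```python
import string

def edits1(word):
    """All strings at edit distance 1 from `word`
    (one deletion, adjacent transposition, replacement, or insertion)."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts) - {word}

def ambiguity(typo, vocab):
    """Number of vocabulary words one edit away from `typo`.
    Assumption: this count stands in for 'number of plausible corrections'."""
    return len(edits1(typo) & vocab)

def best_typo(word, vocab):
    """Pick a non-word one edit away from `word` that has the most
    candidate corrections; ties are broken alphabetically."""
    candidates = sorted(t for t in edits1(word) if t not in vocab)
    return max(candidates, key=lambda t: ambiguity(t, vocab))

# Toy vocabulary (hypothetical); a real system would use a full lexicon.
VOCAB = {"cat", "cot", "cut", "car", "can", "cap", "bat", "hat", "coat"}

if __name__ == "__main__":
    typo = best_typo("cat", VOCAB)
    print(typo, ambiguity(typo, VOCAB))
```

Under this toy measure, a typo such as "ca" (four neighboring words) is preferred over "cxt" (three), because an automatic corrector has more alternatives to choose among. A human reader would typically resolve the typo from context, which is the asymmetry the abstract relies on.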
Keywords
matrices,data hiding,computer programming,information hiding