Reconstructing Corrupt Deflated Files

Digital Investigation (2011)

Abstract
We present a method for locating a synchronization point within a DEFLATE-compressed bit stream (as used in Zip and gzip archives) whose beginning is unknown or damaged. Decompressing from the synchronization point forward yields a mixed stream of literal bytes and co-indexed unknown bytes. Language modeling in the form of byte trigrams and word unigrams is then applied to the resulting stream to infer probable replacements for each co-indexed unknown byte. Unique inferences can be made for approximately 30% of the co-indices, permitting reconstruction of approximately 75% of the unknown bytes recovered from the compressed data with accuracy in excess of 90%. The program implementing these techniques is available as open-source software. © 2011 R. Brown. Published by Elsevier Ltd. All rights reserved.
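To make the notion of "co-indexed unknown bytes" concrete, the following is a minimal sketch and not the paper's implementation. It assumes the DEFLATE stream has already been decoded into LZ77 tokens from the synchronization point onward; the token representation and the names `resolve_tokens` and `fill` are hypothetical, chosen only for illustration. A back-reference that reaches before the synchronization point produces an unknown byte tagged with its offset into the lost history, and every later copy of that byte carries the same tag.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical token model (an assumption, not the paper's data structures):
#   int              -> a literal byte value 0..255
#   (distance, len)  -> an LZ77 back-reference
Token = Union[int, Tuple[int, int]]
# An unknown byte is ("?", k): the byte that sat k positions before the
# synchronization point. Equal k means the bytes are co-indexed.
Symbol = Union[int, Tuple[str, int]]


def resolve_tokens(tokens: List[Token]) -> List[Symbol]:
    """Expand tokens into literal bytes and co-indexed unknown bytes."""
    out: List[Symbol] = []
    for tok in tokens:
        if isinstance(tok, int):
            out.append(tok)
            continue
        distance, length = tok
        if distance <= 0 or length <= 0:
            raise ValueError("distance and length must be positive")
        for _ in range(length):
            src = len(out) - distance
            if src >= 0:
                # Source lies in recovered output; a copied unknown keeps its tag.
                out.append(out[src])
            else:
                # Source lies in the lost history before the sync point.
                out.append(("?", -src))
    return out


def fill(stream: List[Symbol], inferred: Dict[int, int]) -> List[Symbol]:
    """Substitute inferred byte values for every occurrence of a co-index."""
    result: List[Symbol] = []
    for sym in stream:
        if isinstance(sym, tuple) and sym[1] in inferred:
            result.append(inferred[sym[1]])
        else:
            result.append(sym)
    return result


tokens: List[Token] = [(5, 3), ord("a"), (4, 2), (10, 1)]
stream = resolve_tokens(tokens)
patched = fill(stream, {4: ord("t")})
```

In this toy input, co-index 4 appears three times in `stream`, so a single inference for it replaces three unknown bytes at once. This is the sense in which resolving a minority of co-indices can recover a majority of unknown bytes, as the figures in the abstract indicate.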
Keywords
Data recovery, File reconstruction, DEFLATE compression, Zip archive, Language modeling