Finesse: Fine-Grained Feature Locality Based Fast Resemblance Detection For Post-Deduplication Delta Compression

PROCEEDINGS OF THE 17TH USENIX CONFERENCE ON FILE AND STORAGE TECHNOLOGIES(2019)

引用 29|浏览27
暂无评分
摘要
In storage systems, delta compression is often used as a complementary data reduction technique for data deduplication because it is able to eliminate redundancy among the non-duplicate but highly similar chunks. Currently, what we call `N-transform Super-Feature' (N-transform SF) is the most popular and widely used approach to computing data similarity for detecting delta compression candidates. But our observations suggest that the N-transform SF is compute-intensive: it needs to linearly transform each Rabin fingerprint of the data chunks N times to obtain N features, and can be simplified by exploiting the fine-grained feature locality existing among highly similar chunks to eliminate time-consuming linear transformations. Therefore, we propose Finesse, a fine-grained feature-locality-based fast resemblance detection approach that divides each chunk into several fixed-sized subchunks, computes features from these subchunks individually, and then groups the features into super-features. Experimental results show that, compared with the state-of-the-art N-transform SF approach, Finesse accelerates the similarity computation for resemblance detection by 3.2x similar to 3.5x and increases the final throughput of a deduplicated and delta compressed prototype system by 41%similar to 85%, while achieving comparable compression ratios.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要