Block-Oriented Dense Compressor

Data Compression Conference (2011)

We address the problem of block-oriented natural language compression. Adaptive and semi-adaptive compression methods are now common in the field of natural language compression, each with different applications. Block-oriented compression is semi-adaptive within a single block but adaptive across the whole input. Our block-oriented compression method is based on the Dense Code idea. It achieves a very good compression ratio of around 32 % on natural language text and proves to be very fast when searching the compressed text. We show that our method has several interesting properties that could be applied in digital libraries. The compression method allows direct searching of the compressed text. Moreover, the vocabulary can be used as a block index, which makes some kinds of searching very fast. Another property is that the compressor can send single blocks together with their corresponding vocabulary, which suits settings with limited bandwidth. In addition, the compressed file can be continuously extended without first decompressing it.
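The properties listed in the abstract can be illustrated with a minimal sketch. It is not the authors' compressor: it assumes End-Tagged Dense Code as the concrete dense code, whitespace tokenization, and a fixed number of words per block, and all function names (`etdc_encode`, `compress_block`, `blocks_containing`, and so on) are hypothetical. Each block carries its own frequency-ranked vocabulary, so a block can be decoded alone, the vocabularies can act as a block index, and new blocks can be appended without touching earlier ones.

```python
from collections import Counter


def etdc_encode(rank):
    """Encode a 0-based word rank as an End-Tagged Dense Code codeword.

    The last byte has its high bit set (value >= 128); all earlier bytes
    are < 128. Lower ranks (more frequent words) get shorter codewords.
    """
    out = [128 + rank % 128]
    rank //= 128
    while rank > 0:
        rank -= 1
        out.append(rank % 128)
        rank //= 128
    return bytes(reversed(out))


def etdc_decode(data):
    """Decode a byte string of concatenated codewords back into ranks."""
    ranks, acc = [], 0
    for b in data:
        if b < 128:                      # continuation byte
            acc = acc * 128 + b + 1
        else:                            # tagged byte ends the codeword
            ranks.append(acc * 128 + (b - 128))
            acc = 0
    return ranks


def compress_block(words):
    """Semi-adaptive step for one block: count first, then encode."""
    vocab = [w for w, _ in Counter(words).most_common()]
    rank_of = {w: r for r, w in enumerate(vocab)}
    payload = b"".join(etdc_encode(rank_of[w]) for w in words)
    return vocab, payload


def decompress_block(block):
    """Decode one block using only its own vocabulary."""
    vocab, payload = block
    return [vocab[r] for r in etdc_decode(payload)]


def compress(text, block_size=4):
    """Split text into fixed-size word blocks (an assumption of this sketch)."""
    words = text.split()
    return [compress_block(words[i:i + block_size])
            for i in range(0, len(words), block_size)]


def blocks_containing(blocks, word):
    """Use the per-block vocabularies as an index; payloads are never read."""
    return [i for i, (vocab, _) in enumerate(blocks) if word in vocab]


text = "to be or not to be that is the question"
blocks = compress(text)
restored = " ".join(w for b in blocks for w in decompress_block(b))
hits_before = blocks_containing(blocks, "to")

# Extending the compressed data only appends a new block.
blocks.append(compress_block(["to", "sleep"]))
hits_after = blocks_containing(blocks, "to")
```

In this sketch a search for a word first consults the vocabularies and skips every block that cannot contain it; only the remaining blocks would need their payloads scanned for the word's codeword. A real compressor would also need to handle separators and punctuation and to store the vocabulary compactly, which this example leaves out.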
block-oriented dense compressor,compression method,block-oriented compression,block-oriented natural language compression,corresponding vocabulary,block-oriented compression method,block index,natural language text,good compression ratio,semi-adaptive compression method,natural language compression field,natural languages,dictionaries,text analysis,natural language,digital libraries,encoding,natural language processing,real time systems,data compression,digital library