Exploring the Limits of Semantic Image Compression at Micro-bits per Pixel
CoRR (2024)
Abstract
Traditional methods, such as JPEG, perform image compression by operating on
structural information, such as pixel values or frequency content. These
methods are effective down to bitrates of about one bit per pixel (bpp) at
standard image sizes. In contrast, text-based semantic compression directly
stores concepts and their relationships using natural language, which has
evolved with humans to efficiently represent these salient concepts. Semantic
methods can operate at extremely low bitrates by disregarding structural
information like location, size, and orientation. In this work, we use GPT-4V
and DALL-E3 from OpenAI to explore the quality-compression frontier for image
compression and identify the limitations of current technology. We push
semantic compression as low as 100 μbpp (up to 10,000× smaller than
JPEG) by introducing an iterative reflection process to improve the decoded
image. We further hypothesize that this 100 μbpp level represents a soft limit
on semantic compression at standard image resolutions.
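
To make the bitrates above concrete, the following sketch converts bits per pixel into a total bit budget for one image. The 1024×1024 resolution is an assumption chosen for illustration (the abstract only says "standard image sizes"), and the helper name `bits_for_image` is hypothetical. The 1 bpp and 100 μbpp figures are the ones quoted in the abstract.

```python
def bits_for_image(width: int, height: int, bpp: float) -> float:
    """Total bits needed to store an image at a given bits-per-pixel rate."""
    return width * height * bpp


# Assumed resolution for illustration only; not stated in the abstract.
WIDTH, HEIGHT = 1024, 1024

JPEG_BPP = 1.0          # "around one bit per pixel" from the abstract
SEMANTIC_BPP = 100e-6   # 100 micro-bits per pixel from the abstract

jpeg_bits = bits_for_image(WIDTH, HEIGHT, JPEG_BPP)
semantic_bits = bits_for_image(WIDTH, HEIGHT, SEMANTIC_BPP)
ratio = jpeg_bits / semantic_bits

print(f"JPEG at 1 bpp:        {jpeg_bits / 8 / 1024:.0f} KiB")
print(f"Semantic at 100 ubpp: {semantic_bits:.1f} bits "
      f"(about {semantic_bits / 8:.0f} bytes)")
print(f"Size ratio:           {ratio:,.0f}x")
```

Under these assumptions, the semantic representation fits in roughly 105 bits, which is on the order of a dozen characters of plain text, while the 1 bpp JPEG takes 128 KiB. The ratio between the two is the 10,000× figure quoted in the abstract.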