Exploring the Limits of Semantic Image Compression at Micro-bits per Pixel

CoRR (2024)

Abstract
Traditional methods, such as JPEG, perform image compression by operating on structural information, such as pixel values or frequency content. These methods are effective at bitrates of around one bit per pixel (bpp) and higher at standard image sizes. In contrast, text-based semantic compression directly stores concepts and their relationships using natural language, which has evolved with humans to efficiently represent these salient concepts. These methods can operate at extremely low bitrates by disregarding structural information like location, size, and orientation. In this work, we use GPT-4V and DALL-E3 from OpenAI to explore the quality-compression frontier for image compression and identify the limitations of current technology. We push semantic compression as low as 100 μbpp (up to 10,000× smaller than JPEG) by introducing an iterative reflection process to improve the decoded image. We further hypothesize that this 100 μbpp level represents a soft limit on semantic compression at standard image resolutions.
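
To make the quoted bitrates concrete, the following is a minimal arithmetic sketch, not the authors' code. It assumes a 1024×1024 image (the abstract says only "standard image sizes") and a text payload stored at 8 bits per byte with no further entropy coding. The function names `bits_per_pixel`, `bit_budget`, and `size_ratio` are hypothetical helpers introduced here for illustration.

```python
def bits_per_pixel(payload_bytes: int, width: int, height: int) -> float:
    """Bitrate of a payload of `payload_bytes` bytes for a width x height image."""
    return payload_bytes * 8 / (width * height)


def bit_budget(target_bpp: float, width: int, height: int) -> float:
    """Total number of bits available at a given bitrate and resolution."""
    return target_bpp * width * height


def size_ratio(reference_bpp: float, target_bpp: float) -> float:
    """How many times smaller the target encoding is than the reference."""
    return reference_bpp / target_bpp


# Assumption: a 1024x1024 image; the abstract does not state the exact size.
WIDTH, HEIGHT = 1024, 1024

budget = bit_budget(100e-6, WIDTH, HEIGHT)      # bits available at 100 micro-bpp
ratio = size_ratio(1.0, 100e-6)                 # versus roughly 1 bpp for JPEG
rate = bits_per_pixel(13, WIDTH, HEIGHT) * 1e6  # a 13-byte description, in micro-bpp

print(f"budget at 100 micro-bpp: {budget:.1f} bits ({budget / 8:.1f} bytes)")
print(f"ratio vs 1 bpp: {ratio:.0f}x")
print(f"13-byte payload: {rate:.1f} micro-bpp")
```

Under these assumptions, 100 μbpp leaves roughly 105 bits, or about 13 bytes of text, for the entire image, and the ratio to a 1 bpp encoding works out to the 10,000× figure quoted in the abstract.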