ICMH-Net: Neural Image Compression Towards both Machine Vision and Human Vision

MM '23: Proceedings of the 31st ACM International Conference on Multimedia (2023)

Neural image compression has gained significant attention thanks to the remarkable success of deep neural networks. However, most existing neural image codecs optimize solely for human visual perception. In this work, our objective is to improve image compression for human vision quality and machine vision tasks simultaneously. To achieve this, we introduce a novel approach that Partitions, Transmits, Reconstructs, and Aggregates (PTRA) the latent representation of images to balance the optimization of both aspects. By employing our method as a module in existing neural image codecs, we create a latent representation predictor that dynamically manages the bit-rate cost for machine vision tasks. To further improve the performance of auto-regressive-based coding techniques, we enhance our hyperprior network and predictor module with context modules, which reduces the bit-rate. Extensive experiments on machine vision benchmarks including ILSVRC 2012, VOC 2007, VOC 2012, and COCO demonstrate the superiority of the proposed image compression framework. It outperforms existing neural image compression methods on multiple machine vision tasks, including classification, segmentation, and detection, while maintaining high-quality image reconstruction for human vision.
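The abstract names the four PTRA stages without giving their internals. The sketch below is a toy illustration of that data flow only, not the paper's architecture. Every name in it (`partition`, `transmit`, `reconstruct`, `aggregate`, `ptra`, the `keep` parameter) is hypothetical. It assumes that the latent is a list of channels, that "transmit" can be stood in for by rounding, and that the predictor is a simple per-position mean of the received channels. The actual method uses learned networks for these steps.

```python
from typing import List

Channel = List[float]


def partition(latent: List[Channel], keep: int):
    """Split latent channels into a transmitted group and a withheld group."""
    return latent[:keep], latent[keep:]


def transmit(channels: List[Channel]) -> List[Channel]:
    """Stand-in for quantization and entropy coding: round each value."""
    return [[float(round(v)) for v in ch] for ch in channels]


def reconstruct(received: List[Channel], n_missing: int) -> List[Channel]:
    """Toy predictor (assumption): estimate every withheld channel as the
    per-position mean of the received channels."""
    if not received:
        raise ValueError("at least one channel must be transmitted")
    length = len(received[0])
    mean = [sum(ch[i] for ch in received) / len(received) for i in range(length)]
    return [list(mean) for _ in range(n_missing)]


def aggregate(received: List[Channel], predicted: List[Channel]) -> List[Channel]:
    """Reassemble a full latent from received and predicted channels."""
    return received + predicted


def ptra(latent: List[Channel], keep: int) -> List[Channel]:
    """Run the four stages in order; only `keep` channels cost bits."""
    sent, withheld = partition(latent, keep)
    received = transmit(sent)
    predicted = reconstruct(received, len(withheld))
    return aggregate(received, predicted)


if __name__ == "__main__":
    latent = [[0.2, 1.6], [2.4, 3.4], [9.0, 9.0]]
    print(ptra(latent, keep=2))
```

In this toy setup, lowering `keep` sends fewer channels and relies more on the predictor, which loosely mirrors the idea of a predictor managing bit-rate cost. It says nothing about how the paper actually selects or predicts latent elements.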