Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models
CoRR (2024)
Abstract
In the field of neural data compression, the prevailing focus has been on
optimizing algorithms for either classical distortion metrics, such as PSNR or
SSIM, or human perceptual quality. With increasing amounts of data consumed by
machines rather than humans, a new paradigm of machine-oriented
compression, which prioritizes the retention of features salient
for machine perception over traditional human-centric
criteria, has emerged, creating several new challenges to the
development, evaluation, and deployment of systems utilizing lossy compression.
In particular, it is unclear how different approaches to lossy compression will
affect the performance of downstream machine perception tasks. To address this
under-explored area, we evaluate various perception
models (including image classification, image segmentation,
speech recognition, and music source separation) under severe
lossy compression. We utilize several popular codecs spanning conventional,
neural, and generative compression architectures. Our results indicate three
key findings: (1) using generative compression, it is feasible to leverage
highly compressed data while incurring a negligible impact on machine
perceptual quality; (2) machine perceptual quality correlates strongly with
deep similarity metrics, indicating a crucial role of these metrics in the
development of machine-oriented codecs; and (3) using lossy compressed
datasets (e.g., ImageNet) for pre-training can lead to counter-intuitive
scenarios where lossy compression increases machine perceptual quality rather
than degrading it. To encourage engagement with this growing area of research,
our code and experiments are available at:
https://github.com/danjacobellis/MPQ.