Learned image compression has recently shown the potential to outperform all standard codecs. State-of-the-art rate-distortion performance has been achieved by context-adaptive entropy approaches, in which hyperprior and autoregressive models are used jointly to capture the spatial dependencies in the latent representations. However, the latents contain a mixture of high- and low-frequency information, which previous works have represented inefficiently with feature maps of a single spatial resolution. In this paper, we propose the first learned multi-frequency image compression approach, which uses the recently developed octave convolutions to factorize the latents into high- and low-frequency components. Since the low-frequency component is represented at a lower resolution, its spatial redundancy is reduced, which improves the compression rate. Moreover, octave convolutions enable effective communication between the high and low frequencies, which can improve the reconstruction quality. We also develop novel generalized octave convolution and octave transposed-convolution architectures with internal activation layers to preserve the spatial structure of the information. Our experiments show that the proposed scheme outperforms all standard codecs and learning-based methods in both PSNR and MS-SSIM metrics, and establishes the new state of the art for learned image compression.
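To make the multi-frequency idea concrete, the following is a minimal sketch of a single octave-convolution step, not the architecture proposed in the paper. It assumes single-channel feature maps stored as nested lists, scalar weights standing in for 1x1 convolution kernels, 2x2 average pooling for the high-to-low path, and nearest-neighbour upsampling for the low-to-high path. All function and parameter names (`octave_conv_1x1`, `w_hh`, `w_hl`, `w_lh`, `w_ll`) are hypothetical. The generalized variants described above would differ from these fixed resampling operators.

```python
# Illustrative sketch only: scalar weights stand in for learned 1x1 kernels,
# and the resampling operators are fixed (assumptions, not the paper's design).

def avg_pool2(x):
    """Halve the resolution with 2x2 average pooling (even height/width assumed)."""
    return [
        [(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4.0
         for j in range(0, len(x[0]), 2)]
        for i in range(0, len(x), 2)
    ]

def upsample2(x):
    """Double the resolution by repeating each value in a 2x2 block."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def scale(x, w):
    """Apply a scalar weight to every position (a 1x1 convolution on one channel)."""
    return [[w * v for v in row] for row in x]

def add(a, b):
    """Element-wise sum of two maps with the same shape."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def octave_conv_1x1(x_high, x_low, w_hh, w_hl, w_lh, w_ll):
    """Update both frequency branches, exchanging information between them."""
    # High-frequency output: intra-frequency path plus upsampled low-to-high path.
    y_high = add(scale(x_high, w_hh), upsample2(scale(x_low, w_lh)))
    # Low-frequency output: intra-frequency path plus pooled high-to-low path.
    y_low = add(scale(x_low, w_ll), scale(avg_pool2(x_high), w_hl))
    return y_high, y_low

def num_values(x):
    """Number of stored values in a map."""
    return len(x) * len(x[0])

x_high = [[1.0, 2.0, 3.0, 4.0],
          [5.0, 6.0, 7.0, 8.0],
          [1.0, 1.0, 1.0, 1.0],
          [2.0, 2.0, 2.0, 2.0]]
x_low = [[1.0, 0.0],
         [0.0, 1.0]]

y_high, y_low = octave_conv_1x1(x_high, x_low, w_hh=1.0, w_hl=0.5, w_lh=0.5, w_ll=1.0)
```

In this sketch the low-frequency map holds one quarter as many values as the high-frequency map (`num_values(x_low)` is 4 against 16), which is the source of the reduced spatial redundancy, while the two cross terms provide the communication between frequencies.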