Region of Interest (ROI)-based image compression has developed rapidly because it maintains high fidelity in important regions while reducing data redundancy. However, existing methods primarily apply masks to suppress background information before quantization. This explicit bit allocation strategy, which relies on hard gating, significantly alters the statistical distribution seen by the entropy model and thereby limits the coding performance of the compression model. In response, we propose an efficient ROI-based deep image compression model with implicit bit allocation. To better utilize ROI masks for implicit bit allocation, we propose a novel Mask-Guided Feature Enhancement (MGFE) module, comprising a Region-Adaptive Attention (RAA) block and a Frequency-Spatial Collaborative Attention (FSCA) block. This module allows flexible bit allocation across different regions while enhancing global and local features through frequency-spatial domain collaboration. Additionally, we use dual decoders to reconstruct the foreground and background images separately, enabling the coding network to balance foreground enhancement and background quality preservation in a data-driven manner. To the best of our knowledge, this is the first work to utilize implicit bit allocation for high-quality region-adaptive coding. Experiments on the COCO2017 dataset show that our implicit bit allocation method significantly outperforms explicit bit allocation approaches in rate-distortion performance, achieving the best results among the compared methods while maintaining satisfactory visual quality in the reconstructed background regions.
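To make the criticism of explicit bit allocation concrete, the following is a minimal NumPy sketch of hard gating before quantization. It is only an illustration under stated assumptions, not the model proposed here: the latent shape, the Gaussian latent values, the square mask, and the helper names `hard_gate_and_quantize` and `zero_fraction` are all hypothetical, and the MGFE module and dual decoders are not implemented.

```python
import numpy as np


def hard_gate_and_quantize(latent, roi_mask):
    """Explicit allocation sketch: zero the background latents, then round."""
    gated = latent * roi_mask  # hard gating: background positions become 0
    return np.round(gated)     # uniform scalar quantization by rounding


def zero_fraction(quantized):
    """Fraction of quantized symbols that are exactly zero."""
    return float(np.mean(quantized == 0))


rng = np.random.default_rng(0)

# Hypothetical latent tensor (channels x height x width); values are assumed
# Gaussian purely for illustration.
latent = rng.normal(loc=0.0, scale=4.0, size=(8, 16, 16))

# Hypothetical binary ROI mask: a square foreground covering 25% of positions.
roi_mask = np.zeros((1, 16, 16))
roi_mask[:, 4:12, 4:12] = 1.0

q_plain = np.round(latent)                          # no gating
q_gated = hard_gate_and_quantize(latent, roi_mask)  # explicit hard gating

print("zero fraction without gating:", zero_fraction(q_plain))
print("zero fraction with hard gating:", zero_fraction(q_gated))
```

In this toy setting, gating forces every background symbol to zero, so the symbol distribution that an entropy model would have to fit changes abruptly at the mask boundary. An implicit scheme would instead pass the mask to the network as guidance and leave the allocation to be learned.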