LSAP: Rethinking Inversion Fidelity, Perception and Editability in GAN Latent Space

As inversion methods have evolved, the process has come to be divided into two steps. The first step is Image Embedding, in which an encoder or an optimization process embeds images to obtain the corresponding latent codes. The second step, which we name Result Refinement, refines the inversion and editing results. Although the second step significantly improves fidelity, perception and editability remain almost unchanged, because they depend heavily on the inverse latent codes obtained in the first step. A crucial problem is therefore how to obtain latent codes with better perception and editability while retaining reconstruction fidelity. In this work, we first point out that these two characteristics are related to the degree of alignment (or misalignment) of the inverse codes with the synthetic distribution. We then propose the Latent Space Alignment Inversion Paradigm (LSAP), which consists of an evaluation metric and a solution for this problem. Specifically, we introduce the Normalized Style Space ($\mathcal{S^N}$ space) and the $\mathcal{S^N}$ Cosine Distance (SNCD) to measure the misalignment of inversion methods. Since the proposed SNCD is differentiable, it can be optimized in both encoder-based and optimization-based embedding methods, yielding a uniform solution. Extensive experiments in various domains demonstrate that SNCD effectively reflects perception and editability, and that our alignment paradigm achieves state-of-the-art results in both steps. Code is available at https://github.com/caopulan/GANInverter/tree/main/configs/lsap.
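To make the idea of a cosine distance in a normalized style space more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that normalization means subtracting a per-channel mean and dividing by a per-channel standard deviation estimated from synthetic style codes, and that the distance is taken between an inverted code and some reference code drawn from the synthetic distribution. The function names (`fit_normalizer`, `to_normalized_space`, `sncd_sketch`) and the choice of reference are hypothetical; the exact definition of SNCD is given in the paper and the linked repository.

```python
import numpy as np


def fit_normalizer(synthetic_styles, eps=1e-8):
    """Estimate per-channel mean and std from synthetic style codes.

    synthetic_styles: array of shape (num_samples, num_channels).
    ASSUMPTION: mean/std normalization; the paper may define it differently.
    """
    mean = synthetic_styles.mean(axis=0)
    std = synthetic_styles.std(axis=0) + eps  # eps avoids division by zero
    return mean, std


def to_normalized_space(style, mean, std):
    """Map a style code into the (assumed) normalized style space."""
    return (np.asarray(style, dtype=float) - mean) / std


def cosine_distance(a, b, eps=1e-8):
    """1 - cosine similarity along the last axis."""
    num = np.sum(a * b, axis=-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + eps
    return 1.0 - num / den


def sncd_sketch(inverted_style, reference_style, mean, std):
    """Hypothetical SNCD-like score: cosine distance after normalization."""
    a = to_normalized_space(inverted_style, mean, std)
    b = to_normalized_space(reference_style, mean, std)
    return cosine_distance(a, b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for style codes sampled from a generator (random data here).
    synthetic = rng.normal(loc=1.0, scale=2.0, size=(1000, 16))
    mean, std = fit_normalizer(synthetic)
    inverted = synthetic[0] + 0.1 * rng.normal(size=16)
    print(sncd_sketch(inverted, synthetic[0], mean, std))
```

The sketch uses NumPy for readability, so it is not differentiable as written. The same arithmetic expressed in an automatic-differentiation framework would yield gradients, which is the property the abstract relies on when it says the metric can be optimized in both encoder-based and optimization-based embedding.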
