Various controls over the generated data can be extracted from the latent space of a pre-trained GAN, since it implicitly encodes the semantics of the training data. The discovered controls allow one to vary semantic attributes in the generated images, but they usually produce entangled edits that affect multiple attributes at once. Supervised approaches typically sample and annotate a collection of latent codes, then train classifiers in the latent space to identify the controls. Since the data generated by GANs reflects the biases of the original dataset, so do the resulting semantic controls. We propose to address disentanglement by subsampling the generated data to remove over-represented co-occurring attributes, thus balancing the semantics of the dataset before training the classifiers. We demonstrate the effectiveness of this approach by extracting disentangled linear directions for face manipulation on two popular GAN architectures, PGGAN and StyleGAN, and two datasets, CelebAHQ and FFHQ. This approach outperforms state-of-the-art classifier-based methods while avoiding the need for disentanglement-enforcing post-processing.
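The balancing step described above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's exact procedure: the function names are invented, binary per-sample annotations are assumed to be given, and the centroid-difference direction stands in for the normal of a trained linear classifier.

```python
import numpy as np

def balance_cooccurrence(attrs, target, confound, seed=None):
    """Return indices of a subsample in which the `confound` attribute is
    equally represented within each class of the `target` attribute,
    removing their spurious co-occurrence before classifier training.

    attrs: (N, K) 0/1 annotation matrix for N generated samples.
    target, confound: column indices of the two attributes (assumed names).
    """
    rng = np.random.default_rng(seed)
    keep = []
    for t in (0, 1):
        idx = np.flatnonzero(attrs[:, target] == t)
        pos = idx[attrs[idx, confound] == 1]
        neg = idx[attrs[idx, confound] == 0]
        n = min(len(pos), len(neg))  # downsample the over-represented group
        keep.extend(rng.choice(pos, n, replace=False))
        keep.extend(rng.choice(neg, n, replace=False))
    return np.sort(np.asarray(keep))

def edit_direction(latents, labels):
    """Difference of class centroids in latent space, normalized; a simple
    stand-in for the weight vector of a linear classifier."""
    d = latents[labels == 1].mean(axis=0) - latents[labels == 0].mean(axis=0)
    return d / np.linalg.norm(d)
```

On annotations where, say, "smiling" co-occurs with "female" 80% of the time, `balance_cooccurrence` keeps only as many co-occurring samples as non-co-occurring ones in each class, so the direction fitted on the retained latents is less likely to entangle the two attributes.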