StyleGANs have shown impressive results on data generation and manipulation in recent years, thanks to their disentangled style latent space. Much effort has been devoted to inverting a pretrained generator, where an encoder is trained ad hoc after the generator, in a two-stage fashion. In this paper, we focus on style-based generators and ask a scientific question: does forcing such a generator to reconstruct real data lead to a more disentangled latent space, and does it make the inversion from image to latent space easier? We describe a new methodology for training a style-based autoencoder in which the encoder and generator are optimized end-to-end. We show that our proposed model consistently outperforms baselines in terms of image inversion and generation quality. Supplementary material, code, and pretrained models are available on the project website.
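The contrast between the two-stage setup (encoder fit after the generator is frozen) and the proposed end-to-end optimization can be illustrated with a minimal sketch. This is not the paper's architecture: the sketch below replaces the style-based encoder and generator with plain linear maps `E` and `G` (hypothetical names) trained jointly on a reconstruction loss over random stand-in data, so that both components receive the gradient signal.

```python
import numpy as np

# Toy end-to-end autoencoder: encoder E (image -> latent) and
# generator G (latent -> image) are updated jointly from the same
# reconstruction loss, unlike a two-stage scheme where G is frozen.
rng = np.random.default_rng(0)
d_img, d_lat = 16, 4
E = rng.normal(0.0, 0.1, (d_lat, d_img))   # encoder weights
G = rng.normal(0.0, 0.1, (d_img, d_lat))   # generator weights
x = rng.normal(size=(32, d_img))           # stand-in "real data" batch
lr = 0.05

def recon_loss(E, G, x):
    # x -> w = x E^T -> x_hat = w G^T; mean squared reconstruction error
    x_hat = (x @ E.T) @ G.T
    return np.mean((x_hat - x) ** 2)

loss0 = recon_loss(E, G, x)
for _ in range(500):
    w = x @ E.T                            # latent codes
    x_hat = w @ G.T                        # reconstructions
    r = 2.0 * (x_hat - x) / x.size         # dL/dx_hat
    grad_G = r.T @ w                       # gradient w.r.t. generator
    grad_E = (r @ G).T @ x                 # gradient flows back into encoder
    G -= lr * grad_G                       # both modules updated together
    E -= lr * grad_E
```

After training, the joint reconstruction loss is strictly lower than at initialization; in the paper's setting the same end-to-end signal is what is hypothesized to shape a more invertible latent space.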