Generative adversarial networks make it possible to generate deceptively realistic images that are almost indistinguishable from actual photographs. Such systems, however, rely on large datasets to realistically replicate the target domain. This becomes especially problematic when the goal is not merely to generate random new images, but also to co-model specific (continuous) features. A particularly important use case in \emph{Human-Computer Interaction} (HCI) research is the generation of emotional images of human faces, which can serve various applications, such as the automatic generation of avatars. The core problem lies in the availability of training data. Most datasets suitable for this task rely on categorical emotion models and therefore provide only discrete annotation labels. This greatly hinders the learning and modeling of smooth transitions between displayed affective states. To overcome this challenge, we explore the potential of label interpolation to enable networks trained on categorical datasets to generate images conditioned on continuous features.
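To make the label-interpolation idea concrete, the following minimal Python sketch shows how discrete one-hot emotion labels could be blended into continuous conditioning vectors for a conditional generator. All names here (\texttt{NUM\_EMOTIONS}, the class indices, the \texttt{generator} call) are illustrative assumptions, not the paper's actual implementation.

\begin{verbatim}
import numpy as np

NUM_EMOTIONS = 7  # assumed: a categorical emotion dataset with 7 classes

def one_hot(label: int, num_classes: int = NUM_EMOTIONS) -> np.ndarray:
    """Encode a discrete emotion label as a one-hot vector."""
    vec = np.zeros(num_classes, dtype=np.float32)
    vec[label] = 1.0
    return vec

def interpolate_labels(src: int, dst: int, alpha: float) -> np.ndarray:
    """Linearly blend two discrete class labels into a continuous
    conditioning vector; alpha in [0, 1] moves from src to dst."""
    return (1.0 - alpha) * one_hot(src) + alpha * one_hot(dst)

# Example: a smooth transition from class 0 (e.g. "neutral")
# to class 3 (e.g. "happy") in five steps.
for alpha in np.linspace(0.0, 1.0, 5):
    cond = interpolate_labels(0, 3, float(alpha))
    # In a conditional GAN, `cond` would be concatenated with the
    # latent noise vector z and fed to the generator, e.g.:
    #   image = generator(np.concatenate([z, cond]))
    print(np.round(cond, 2))
\end{verbatim}

Sweeping \texttt{alpha} at inference time then yields a sequence of conditioning vectors that, if the generator has learned a smooth conditional mapping, corresponds to a gradual transition between the two displayed affective states.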