Deep generative models can emulate the perceptual properties of complex image datasets, providing a latent representation of the data. However, manipulating such a representation to perform meaningful and controllable transformations in the data space remains challenging without some form of supervision. While previous work has focused on exploiting statistical independence to disentangle latent factors, we argue that such a requirement is too restrictive and propose instead a non-statistical framework that relies on counterfactual manipulations to uncover a modular structure of the network composed of disentangled groups of internal variables. Experiments with a variety of generative models trained on complex image datasets show that the obtained modules can be used to design targeted interventions. This opens the way to applications such as computationally efficient style transfer and the automated assessment of robustness to contextual changes in pattern recognition systems.