A common failure mode of density models trained as variational autoencoders is to model the data without relying on their latent variables, rendering these variables useless. Two contributing factors, the underspecification of the model and the looseness of the variational lower bound, have been studied separately in the literature. We weave these two strands of research together, specifically the tighter bounds of Monte-Carlo objectives and constraints on the mutual information between the observable and the latent variables. Estimating the mutual information as the average Kullback-Leibler divergence between the easily available variational posterior $q(z|x)$ and the prior does not work with Monte-Carlo objectives because $q(z|x)$ is no longer a direct approximation to the model's true posterior $p(z|x)$. Hence, we construct estimators of the Kullback-Leibler divergence of the true posterior from the prior by recycling samples used in the objective, with which we train models of continuous and discrete latents at substantially improved rate-distortion and with no posterior collapse. While alleviated, the tradeoff between modelling the data and using the latents remains, and we argue for evaluating inference methods across a range of mutual information values.
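For orientation, here is a sketch of the two quantities the abstract contrasts, written in the usual VAE notation; the symbol $p_{\mathcal{D}}(x)$ for the data distribution is our own assumption, not taken from the paper:

$$
\underbrace{\mathbb{E}_{p_{\mathcal{D}}(x)}\!\big[ D_{\mathrm{KL}}\big(q(z \mid x) \,\|\, p(z)\big) \big]}_{\text{standard rate term based on the variational posterior}}
\qquad \text{versus} \qquad
\underbrace{\mathbb{E}_{p_{\mathcal{D}}(x)}\!\big[ D_{\mathrm{KL}}\big(p(z \mid x) \,\|\, p(z)\big) \big]}_{\text{average KL of the true posterior from the prior}}
$$

Under a Monte-Carlo objective (e.g. an importance-weighted bound), $q(z|x)$ acts only as a proposal distribution, so the left-hand quantity need not track the model's mutual information; the right-hand quantity does, and, per the abstract, the paper estimates it by recycling the samples already drawn for the objective.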