The central objective function of a variational autoencoder (VAE) is its variational lower bound (the ELBO). Here we show that for standard (i.e., Gaussian) VAEs the ELBO converges to a value given by the sum of three entropies: the (negative) entropy of the prior distribution, the expected (negative) entropy of the observable distribution, and the average entropy of the variational distributions (the latter is already part of the ELBO). Our derived analytical results are exact and apply to small as well as to intricate deep networks for encoder and decoder. Furthermore, they apply to finitely as well as infinitely many data points and to any stationary point (including local maxima and saddle points). The result implies that, for standard VAEs, the ELBO can often be computed in closed form at stationary points, while the original ELBO requires numerical approximation of integrals. As our main contribution, we provide the proof that the ELBO of VAEs is, at stationary points, equal to a sum of entropies. Numerical experiments then show that the obtained analytical results are sufficiently precise in the vicinities of stationary points that are reached in practice. Furthermore, we discuss how the novel entropy form of the ELBO can be used to analyze and understand learning behavior. More generally, we believe that our contributions can be useful for future theoretical and practical studies of VAE learning, as they provide novel information about the points in parameter space that VAE optimization converges to.
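To make the entropy-sum form concrete, the identity at stationary points can be written as follows (a sketch under standard Gaussian-VAE assumptions; the symbols \(\Phi\) and \(\theta\) for encoder and decoder parameters, \(x^{(n)}\) for the \(N\) data points, and \(z\) for the latents are illustrative notation, not fixed by the abstract):

\[
\mathcal{L}(\Phi,\theta)
= \frac{1}{N}\sum_{n=1}^{N} \mathcal{H}\!\left[q_\Phi\!\left(z \mid x^{(n)}\right)\right]
- \mathcal{H}\!\left[p_\theta(z)\right]
- \frac{1}{N}\sum_{n=1}^{N} \mathbb{E}_{q_\Phi(z \mid x^{(n)})}\!\left[\mathcal{H}\!\left[p_\theta(x \mid z)\right]\right].
\]

For a standard Gaussian VAE each term is available in closed form: for instance, a prior \(p_\theta(z) = \mathcal{N}(0, I_H)\) has entropy \(\frac{H}{2}\log(2\pi e)\), and a decoder \(p_\theta(x \mid z) = \mathcal{N}(\mu_\theta(z), \sigma^2 I_D)\) has entropy \(\frac{D}{2}\log(2\pi e \sigma^2)\) independently of \(z\), which is what makes the closed-form evaluation at stationary points possible.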