Variational inference (VI) is a popular approach to Bayesian inference that seeks the best approximation of the posterior distribution within a parametric family, by minimizing a loss that is typically the (reverse) Kullback-Leibler (KL) divergence. In this paper, we focus on the following parametric family: mixtures of isotropic Gaussians (i.e., with diagonal covariance matrices proportional to the identity) with uniform weights. We develop a variational framework and provide efficient algorithms suited to this family. In contrast with mixtures of Gaussians with generic covariance matrices, this choice strikes a balance between accurately approximating multimodal Bayesian posteriors and remaining memory- and computationally efficient. Our algorithms perform gradient descent on the locations of the mixture components (the modes of the Gaussians), and either (entropic) mirror descent or Bures descent on their variance parameters. We illustrate the performance of our algorithms in numerical experiments.
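To fix notation, a minimal formalization of the family and objective described above can be written as follows; the symbols $K$, $\mu_k$, $\sigma_k^2$ are introduced here for illustration and the paper's exact parameterization may differ:
\[
q_\theta(x) = \frac{1}{K}\sum_{k=1}^{K} \mathcal{N}\!\big(x;\ \mu_k,\ \sigma_k^2 I_d\big),
\qquad \theta = \{(\mu_k,\sigma_k^2)\}_{k=1}^{K},
\]
and the variational problem is the reverse-KL minimization
\[
\min_{\theta}\ \mathrm{KL}\!\left(q_\theta \,\|\, \pi\right)
= \min_{\theta}\ \mathbb{E}_{x\sim q_\theta}\!\big[\log q_\theta(x) - \log \pi(x)\big],
\]
where $\pi$ denotes the target posterior. The algorithms then update the locations $\mu_k$ by gradient descent and the variance parameters $\sigma_k^2$ by (entropic) mirror or Bures descent.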