Bayesian coresets have emerged as a promising approach to scalable Bayesian inference. The Bayesian coreset problem involves selecting a (weighted) subset of the data samples such that posterior inference using the selected subset closely approximates posterior inference using the full dataset. This manuscript revisits Bayesian coresets through the lens of sparsity-constrained optimization. Leveraging recent advances in accelerated optimization methods, we propose and analyze a novel algorithm for coreset selection. We provide explicit convergence rate guarantees and present an empirical evaluation on a variety of benchmark datasets, highlighting our proposed algorithm's superior performance over state-of-the-art methods in both speed and accuracy.
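To make the sparsity-constrained formulation concrete, the sketch below illustrates one common way to cast coreset selection as an optimization problem: given feature vectors $g_n$ for each data point (e.g., log-likelihoods evaluated at posterior samples), find nonnegative weights $w$ with at most $K$ nonzeros minimizing $\|G^\top w - \sum_n g_n\|^2$, solved here by projected gradient descent with hard thresholding. This is a generic iterative-hard-thresholding sketch under assumed notation, not the manuscript's actual accelerated algorithm; the function name and interface are hypothetical.

```python
import numpy as np

def coreset_iht(G, K, steps=200, lr=None):
    """Sketch: select a K-point Bayesian coreset by iterative hard thresholding.

    G: (N, D) array whose n-th row is a feature vector g_n for data point n.
    The full-data target is s = sum_n g_n; we seek nonnegative weights w with
    at most K nonzeros minimizing ||G^T w - s||^2 (hypothetical formulation).
    """
    N, _ = G.shape
    s = G.sum(axis=0)
    if lr is None:
        # Step size 1/L with L the largest eigenvalue of G G^T
        # (fine for a sketch; too expensive for very large N).
        lr = 1.0 / np.linalg.norm(G @ G.T, 2)
    w = np.zeros(N)
    for _ in range(steps):
        grad = G @ (G.T @ w - s)       # gradient of the quadratic objective
        w = w - lr * grad              # gradient step
        w = np.maximum(w, 0.0)         # project onto nonnegative weights
        if np.count_nonzero(w) > K:    # hard-threshold to the top-K support
            keep = np.argpartition(w, -K)[-K:]
            mask = np.zeros(N, dtype=bool)
            mask[keep] = True
            w[~mask] = 0.0
    return w
```

The returned weights define the coreset: the data points with nonzero weight, each contributing with its learned weight to the approximate posterior.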