Improving Sampling Accuracy of Stochastic Gradient MCMC Methods via Non-uniform Subsampling of Gradients

Many Markov chain Monte Carlo (MCMC) methods use gradient information of the potential function of the target distribution to explore the sample space efficiently. However, computing gradients is often expensive in large-scale applications, such as those in contemporary machine learning. Stochastic gradient (SG-)MCMC methods replace exact gradients with stochastic approximations, commonly computed from uniformly subsampled data points, and thereby gain computational efficiency at the price of introducing sampling error. We propose a non-uniform subsampling scheme to improve the sampling accuracy. The proposed exponentially weighted stochastic gradient (EWSG) is designed so that a non-uniform SG-MCMC method mimics the statistical behavior of a batch-gradient MCMC method, which reduces the inaccuracy caused by the SG approximation. EWSG differs from classical variance reduction (VR) techniques in that it focuses on the entire distribution rather than just the variance; nevertheless, its reduced local variance is also proved. EWSG can also be viewed as an extension to sampling tasks of the importance sampling idea, which has been successful in stochastic-gradient-based optimization. In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index, which is coupled to the MCMC algorithm. Numerical experiments are provided not only to demonstrate EWSG's effectiveness, but also to guide hyperparameter choices and to validate our non-asymptotic global error bound despite the approximations made in the implementation. Notably, while statistical accuracy is improved, convergence speed can be comparable to that of the uniform version, which makes EWSG a practical alternative to VR (although EWSG and VR can also be combined).
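
To make the idea of a Metropolis-Hastings chain on the data index concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical function `weight(i)` that returns an unnormalized positive weight for data index `i`; the actual EWSG weights are not given in this abstract, so fixed placeholder weights are used. The names `mh_index_chain` and `toy_weights` are likewise illustrative. The point of the sketch is that each step evaluates only two weights, so indices can be drawn non-uniformly without ever normalizing over the whole data set.

```python
import numpy as np


def mh_index_chain(weight, n, steps, rng, start=0):
    """Run a Metropolis-Hastings chain on data indices {0, ..., n-1}.

    `weight(i)` is a hypothetical callable returning an unnormalized,
    strictly positive weight for index i. The proposal is uniform over
    all indices (hence symmetric), so a proposed index j is accepted
    with probability min(1, weight(j) / weight(i)).
    """
    i = start
    w_i = weight(i)
    visited = np.empty(steps, dtype=int)
    for t in range(steps):
        j = int(rng.integers(n))       # uniform proposal over indices
        w_j = weight(j)
        if rng.random() * w_i < w_j:   # accept with prob. min(1, w_j / w_i)
            i, w_i = j, w_j
        visited[t] = i                 # record the current index
    return visited


rng = np.random.default_rng(0)
# Placeholder weights for illustration only; these are NOT the EWSG weights.
toy_weights = np.array([1.0, 2.0, 3.0, 4.0])
visited = mh_index_chain(lambda i: toy_weights[i], n=4, steps=100_000, rng=rng)

# Long-run visit frequencies should approach the normalized weights.
empirical = np.bincount(visited, minlength=4) / visited.size
target = toy_weights / toy_weights.sum()
print(np.round(empirical, 3), target)
```

In this sketch the weights are fixed, so the index chain runs on its own. In the setting the abstract describes, the chain is coupled to the MCMC algorithm, so the weights would presumably depend on the current state of the sampler; that dependence is left out here.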
