A fundamental task in kernel methods is to pick nodes and weights so as to approximate a given function from a reproducing kernel Hilbert space (RKHS) by a weighted sum of kernel translates located at the nodes. This is the crux of kernel density estimation, kernel quadrature, and interpolation from discrete samples. Furthermore, RKHSs offer a convenient mathematical and computational framework. We introduce and analyse continuous volume sampling (VS), the continuous counterpart, for choosing node locations, of a discrete distribution introduced in (Deshpande & Vempala, 2006). Our contribution is theoretical: we prove almost optimal bounds for interpolation and quadrature under VS. While similar bounds already exist for some specific RKHSs using ad hoc node constructions, VS offers bounds that apply to any Mercer kernel and depend on the spectrum of the associated integration operator. We emphasize that, unlike previous randomized approaches that rely on regularized leverage scores or determinantal point processes, evaluating the pdf of VS only requires pointwise evaluations of the kernel. VS is thus naturally amenable to MCMC samplers.
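The point that the VS density needs only pointwise kernel evaluations can be illustrated with a minimal sketch. It assumes the unnormalized density of a node set is the determinant of its kernel Gram matrix, taken against a base measure. The Gaussian kernel, the uniform base measure on [0, 1], the single-site random-walk Metropolis sampler, and every function name below are illustrative assumptions, not the construction analysed in the paper.

```python
import math
import random


def gaussian_kernel(x, y, lengthscale=0.2):
    """Gaussian kernel on the real line (an assumed, illustrative Mercer kernel)."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))


def gram(nodes, kernel):
    """Gram matrix K[i][j] = kernel(nodes[i], nodes[j]): pointwise evaluations only."""
    return [[kernel(a, b) for b in nodes] for a in nodes]


def cholesky(matrix):
    """Lower Cholesky factor, or None if the matrix is not numerically positive definite."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return None
                lower[i][i] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return lower


def log_vs_density(nodes, kernel):
    """Unnormalized log-density log det K(nodes); assumed uniform base measure on [0, 1]."""
    if any(x < 0.0 or x > 1.0 for x in nodes):
        return -math.inf
    lower = cholesky(gram(nodes, kernel))
    if lower is None:
        return -math.inf
    return 2.0 * sum(math.log(lower[i][i]) for i in range(len(nodes)))


def sample_nodes(n_nodes, kernel, n_steps=2000, step=0.1, seed=0):
    """Single-site random-walk Metropolis targeting the unnormalized density above."""
    rng = random.Random(seed)
    # Start from an equispaced grid so the initial log-density is finite.
    nodes = [(i + 0.5) / n_nodes for i in range(n_nodes)]
    log_p = log_vs_density(nodes, kernel)
    for _ in range(n_steps):
        i = rng.randrange(n_nodes)
        proposal = list(nodes)
        proposal[i] += rng.gauss(0.0, step)  # symmetric proposal
        log_q = log_vs_density(proposal, kernel)
        if math.log(1.0 - rng.random()) < log_q - log_p:
            nodes, log_p = proposal, log_q
    return nodes


def interpolation_weights(nodes, values, kernel):
    """Solve K w = values, so the weighted sum of kernel translates interpolates at the nodes."""
    lower = cholesky(gram(nodes, kernel))
    if lower is None:
        raise ValueError("Gram matrix is not numerically positive definite")
    n = len(nodes)
    z = [0.0] * n
    for i in range(n):  # forward substitution: L z = values
        z[i] = (values[i] - sum(lower[i][k] * z[k] for k in range(i))) / lower[i][i]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution: L^T w = z
        w[i] = (z[i] - sum(lower[k][i] * w[k] for k in range(i + 1, n))) / lower[i][i]
    return w


def interpolant(x, nodes, weights, kernel):
    """Evaluate the weighted sum of kernel translates at x."""
    return sum(w * kernel(x, t) for w, t in zip(weights, nodes))
```

The Metropolis acceptance step uses only a ratio of densities, so the normalization constant never has to be computed. A typical use would call `sample_nodes(5, gaussian_kernel)` to draw nodes, then `interpolation_weights` on the function values at those nodes.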