This paper addresses the problem of approximating an unknown function from point evaluations. When obtaining these point evaluations is costly, minimising the required sample size becomes crucial, and it is unreasonable to reserve a sufficiently large test sample for estimating the approximation accuracy. Therefore, an approximation with a certified quasi-optimality factor is required. This paper shows that such an approximation can be obtained when the sought function lies in a reproducing kernel Hilbert space (RKHS) and is to be approximated in a finite-dimensional linear subspace $\mathcal{V}_d$. However, selecting the sample points to minimise the quasi-optimality factor requires optimising over an infinite set of points and computing exact inner products in the RKHS, which is often infeasible in practice. Extending results from optimal sampling for $L^2$ approximation, this paper proves that random points, drawn independently from the Christoffel sampling distribution associated with $\mathcal{V}_d$, can yield a controllable quasi-optimality factor with high probability. Inspired by this result, a novel sampling scheme, coined subspace-informed volume sampling, is introduced and evaluated in numerical experiments, where it outperforms classical i.i.d. Christoffel sampling and continuous volume sampling. To reduce the size of such a random sample, an additional greedy subsampling scheme with provable suboptimality bounds is introduced. Our presentation is of independent interest to the inverse problems community, as it offers a simpler interpretation of the parametrised background data weak (PBDW) method.
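For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of classical i.i.d. Christoffel sampling in the $L^2$ setting, not of the RKHS analysis or the subspace-informed volume sampling scheme of the paper. It assumes, purely for illustration, that $\mathcal{V}_d$ is the space of polynomials of degree less than $d$ on $[-1, 1]$ with the uniform probability measure as reference; the function names `orthonormal_legendre`, `christoffel_function`, and `christoffel_sample` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def orthonormal_legendre(x, d):
    """Evaluate the first d Legendre polynomials at x, scaled so that they
    are orthonormal w.r.t. the uniform probability measure dx/2 on [-1, 1]."""
    # legvander returns P_0, ..., P_{d-1} along the last axis.
    V = legendre.legvander(np.asarray(x, dtype=float), d - 1)
    return V * np.sqrt(2 * np.arange(d) + 1)


def christoffel_function(x, d):
    """Sum of squared orthonormal basis functions (inverse Christoffel function).
    The Christoffel sampling density w.r.t. dx/2 is this value divided by d."""
    return (orthonormal_legendre(x, d) ** 2).sum(axis=-1)


def christoffel_sample(n, d, rng):
    """Draw n i.i.d. points from the Christoffel density by rejection sampling.

    Proposal: uniform on [-1, 1]. Since |P_j| <= 1 on [-1, 1], the sum of
    squares is bounded by d**2, so a proposal is accepted with probability
    christoffel_function(x, d) / d**2.
    """
    samples = []
    while len(samples) < n:
        x = rng.uniform(-1.0, 1.0, size=2 * d * n)
        accept = rng.uniform(size=x.size) < christoffel_function(x, d) / d**2
        samples.extend(x[accept])
    return np.array(samples[:n])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 5, 50
    x = christoffel_sample(n, d, rng)
    # Weights commonly used in weighted least squares with Christoffel samples.
    w = d / christoffel_function(x, d)
    print(x.shape, w.min(), w.max())
```

Rejection sampling is chosen here only because it is short and easy to verify; its acceptance rate is $1/d$, so other samplers may be preferable for large $d$.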