Gaussian processes (GPs) are ubiquitous tools for modeling and predicting continuous processes in the physical and engineering sciences. This is partly because a Gaussian process can serve as an interpolator while providing straightforward uncertainty quantification at unobserved locations. Beyond training data, available information sometimes does not take the form of a finite collection of points. For example, boundary value problems supply information on the boundary of a domain, and underlying physics may dictate known behavior on an entire uncountable subset of the domain of interest. Although such information can be approximated by placing pseudo-training points in the known subset, that procedure is ad hoc: it offers little guidance on how many points to use or on the behavior as the number of pseudo-observations grows large. We propose and construct Gaussian processes that unify, via reproducing kernel Hilbert spaces, the typical finite training data case with the case of uncountable information by exploiting the equivalence of conditional expectation and orthogonal projection in Hilbert space. We show existence of the proposed process and establish that it is the limit of a conventional GP conditioned on an increasing number of training points. We illustrate the flexibility and advantages of the proposed approach via numerical experiments.
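To make the pseudo-training-point baseline concrete, the following is a minimal sketch of a conventional GP conditioned on a growing number of pseudo-observations placed in a known subset. It is not the proposed construction, only the ad hoc approximation the abstract contrasts it with. The Matérn-3/2 kernel, the subset [0, 0.3], the function `known_behavior`, the test location 0.5, and the jitter value are all assumptions chosen for illustration.

```python
import numpy as np

def matern32(a, b, lengthscale=0.2):
    """Matern-3/2 covariance (unit variance) between 1-D point sets a and b."""
    r = np.abs(a[:, None] - b[None, :]) * np.sqrt(3.0) / lengthscale
    return (1.0 + r) * np.exp(-r)

def known_behavior(x):
    # Hypothetical known values of the process on the subset [0, 0.3].
    return np.sin(3.0 * x)

def posterior_at(x_star, n_pseudo, jitter=1e-6):
    """Zero-mean GP posterior mean and variance at x_star, given n_pseudo
    equally spaced pseudo-observations of known_behavior on [0, 0.3]."""
    x = np.linspace(0.0, 0.3, n_pseudo)
    xs = np.array([x_star])
    # Small diagonal jitter (an assumption) keeps the solve well conditioned.
    K = matern32(x, x) + jitter * np.eye(n_pseudo)
    k_star = matern32(x, xs)                  # shape (n_pseudo, 1)
    weights = np.linalg.solve(K, k_star)      # K^{-1} k_star
    mean = float(weights[:, 0] @ known_behavior(x))
    var = float(matern32(xs, xs)[0, 0] - k_star[:, 0] @ weights[:, 0])
    return mean, var

# Nested designs: each grid contains the previous one.
sizes = [2, 3, 5, 9, 17, 33]
results = [posterior_at(0.5, n) for n in sizes]
means = [m for m, _ in results]
variances = [v for _, v in results]

for n, (m, v) in zip(sizes, results):
    print(f"n={n:3d}  mean={m:+.4f}  var={v:.4f}")
```

Printing the posterior mean and variance for each grid size shows how the prediction at a point outside the known subset changes as pseudo-observations are added. Nothing in the sketch says how large `n_pseudo` must be, which is the gap the abstract points to.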