Accurate modeling of spatial dependence is pivotal in analyzing spatial data because it influences both parameter estimation and prediction, and valid statistical inference depends on how the spatial structure of the data is represented. Existing models for areal data often rely on adjacency matrices, which struggle to differentiate between polygons of varying sizes and shapes. Data fusion models, in contrast, rely on computationally intensive numerical integrals, which presents challenges even for moderately large datasets. In response to these issues, we propose the Hausdorff-Gaussian process (HGP), a versatile model that uses the Hausdorff distance to capture spatial dependence in both point and areal data. Integrating the HGP into generalized linear mixed-effects models broadens its applicability, particularly for data fusion problems. We validate our approach through a comprehensive simulation study and applications to two real-world scenarios: one involving areal data and another demonstrating its effectiveness in data fusion. The results suggest that the HGP is competitive with specialized models in terms of goodness-of-fit and predictive performance. In summary, the HGP offers a flexible and robust solution for modeling spatial data of various types and shapes, with potential applications in fields such as public health and climate science.
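To make the central idea concrete, the following is a minimal sketch of how a Hausdorff distance can put points and polygons on a common footing. It is an illustration under stated assumptions, not the implementation used in this work: each spatial unit is approximated by a finite set of coordinates (a single location for point data, the vertices for a polygon), and the exponential correlation function and the range parameter `phi` are hypothetical choices, since the abstract does not specify the correlation function. The helper names are likewise invented for this example.

```python
import numpy as np


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).max()


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite coordinate sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def hausdorff_matrix(units):
    """Pairwise Hausdorff distances between a list of spatial units."""
    n = len(units)
    h = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            h[i, j] = h[j, i] = hausdorff(units[i], units[j])
    return h


def exponential_correlation(h, phi):
    """Assumed correlation function exp(-h / phi); phi is a hypothetical range.

    This sketch does not check that the resulting matrix is positive definite.
    """
    return np.exp(-h / phi)


# Two point locations and one polygon (unit square given by its vertices).
point_a = np.array([[0.0, 0.0]])
square = np.array([[1.0, 0.0], [2.0, 0.0], [2.0, 1.0], [1.0, 1.0]])
point_b = np.array([[3.0, 0.0]])

units = [point_a, square, point_b]
H = hausdorff_matrix(units)
R = exponential_correlation(H, phi=2.0)
```

Because points and polygons enter the same distance matrix, the point and areal observations in a data fusion setting share one dependence structure without numerical integration over the polygons. Representing a polygon only by its vertices is a simplification of the distance between the full regions.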