A Scalable Partitioned Approach to Model Massive Nonstationary Non-Gaussian Spatial Datasets

Nonstationary non-Gaussian spatial data are common in many disciplines, including climate science, ecology, epidemiology, and the social sciences. Examples include count data on disease incidence and binary satellite data on cloud mask (cloud/no-cloud). Modeling such datasets as stationary spatial processes can be unrealistic because they are collected over large heterogeneous domains (i.e., spatial behavior differs across subregions). Although several approaches have been developed for nonstationary spatial models, these have focused primarily on Gaussian responses. In addition, fitting nonstationary models to large non-Gaussian datasets is computationally prohibitive. To address these challenges, we propose a scalable algorithm for modeling such data by leveraging parallel computing in modern high-performance computing systems. We partition the spatial domain into disjoint subregions and fit locally nonstationary models using a carefully curated set of spatial basis functions. Then, we combine the local processes using a novel neighbor-based weighting scheme. Our approach scales well to massive datasets (e.g., 1 million samples) and can be implemented in nimble, a popular software environment for Bayesian hierarchical modeling. We demonstrate our method on simulated examples and two large real-world datasets pertaining to infectious diseases and remote sensing.
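
The partition, fit-locally, then combine idea can be illustrated with a minimal sketch. This is not the paper's algorithm: it swaps the Bayesian non-Gaussian model for ridge-regularized least squares on a Gaussian response, uses Gaussian radial basis functions as a stand-in for the curated basis set, and uses a simple distance-based kernel over the nearest blocks as a stand-in for the neighbor-based weighting scheme. All function names and parameters (`rbf_features`, `fit_partitioned`, `predict_partitioned`, `n_side`, `n_basis`, `ridge`, `n_neighbors`, `bandwidth`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def rbf_features(xy, centers, scale):
    # Intercept column plus Gaussian radial basis functions at each location.
    d2 = ((xy[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.hstack([np.ones((len(xy), 1)), np.exp(-d2 / (2 * scale ** 2))])

def fit_partitioned(xy, y, n_side=3, n_basis=3, ridge=1e-3):
    # Assumes locations lie in the unit square and n_basis >= 2.
    # Assign every point to exactly one of n_side x n_side disjoint blocks.
    idx = np.minimum((xy * n_side).astype(int), n_side - 1)
    width = 1.0 / n_side
    models = []
    for i in range(n_side):
        for j in range(n_side):
            lo = np.array([i, j]) * width
            inside = (idx[:, 0] == i) & (idx[:, 1] == j)
            # Local basis centers on a regular grid covering the block.
            gx = np.linspace(lo[0], lo[0] + width, n_basis)
            gy = np.linspace(lo[1], lo[1] + width, n_basis)
            basis = np.array([(a, b) for a in gx for b in gy])
            scale = width / (n_basis - 1)
            phi = rbf_features(xy[inside], basis, scale)
            # Ridge-regularized least squares (assumption: Gaussian response).
            gram = phi.T @ phi + ridge * np.eye(phi.shape[1])
            beta = np.linalg.solve(gram, phi.T @ y[inside])
            models.append({"center": lo + width / 2, "basis": basis,
                           "scale": scale, "beta": beta})
    return models

def predict_partitioned(models, xy_new, n_neighbors=4, bandwidth=0.1):
    centers = np.array([m["center"] for m in models])
    d2 = ((xy_new[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Kernel weights, zeroed outside the n_neighbors nearest blocks.
    k = min(n_neighbors, len(models))
    cutoff = np.sort(d2, axis=1)[:, k - 1:k]
    w = np.where(d2 <= cutoff, np.exp(-d2 / (2 * bandwidth ** 2)), 0.0)
    w /= w.sum(axis=1, keepdims=True)
    # Evaluate every local model at the new locations and blend them.
    preds = np.column_stack([
        rbf_features(xy_new, m["basis"], m["scale"]) @ m["beta"] for m in models
    ])
    return (w * preds).sum(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xy = rng.uniform(size=(2000, 2))
    y = np.sin(4 * xy[:, 0]) + np.cos(3 * xy[:, 1]) + 0.1 * rng.normal(size=2000)
    models = fit_partitioned(xy, y)
    print(predict_partitioned(models, rng.uniform(size=(5, 2))))
```

Because each block is fitted independently of the others, the loop in `fit_partitioned` could in principle be distributed across processes or compute nodes, which is the property the abstract relies on for scalability.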
