A flexible and robust non-parametric test of exchangeability

Exchangeability · Extensibility · Statistic · Homogeneity · Cluster

September 30, 2021

Alan J. Aw, Jeffrey P. Spence, Yun S. Song

From arXiv; 25 pages (excluding Supplementary Material and Appendices)

Many statistical analyses assume that the data points within a sample are exchangeable and their features have some known dependency structure. Given a feature dependency structure, one can ask if the observations are exchangeable, in which case we say that they are homogeneous. Homogeneity may be the end goal of a clustering algorithm or a justification for not clustering. Apart from random matrix theory approaches, few general approaches provide statistical guarantees of exchangeability or homogeneity without labeled examples from distinct clusters. We propose a fast and flexible non-parametric hypothesis testing approach that takes as input a multivariate individual-by-feature dataset and user-specified feature dependency constraints, without labeled examples, and reports whether the individuals are exchangeable at a user-specified significance level. Our approach controls Type I error across realistic scenarios and handles data of arbitrary dimension. We perform an extensive simulation study to evaluate the efficacy of domain-agnostic tests of stratification, and find that our approach compares favorably in various scenarios of interest. Finally, we apply our approach to post-clustering single-cell chromatin accessibility data and World Values Survey data, and show how it helps to identify drivers of heterogeneity and generate clusters of exchangeable individuals.
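
The abstract does not say which test statistic or resampling scheme is used, so the following is only a minimal sketch of one generic way to test whether individuals are exchangeable. It assumes the simplest feature dependency constraint, namely that all features are mutually independent. The function names, the choice of statistic (the variance of pairwise squared distances between individuals) and the default parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def pairwise_distance_variance(X):
    """Variance of squared Euclidean distances over all pairs of rows."""
    sq_norms = (X ** 2).sum(axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    upper = np.triu_indices(X.shape[0], k=1)  # each unordered pair once
    return d2[upper].var()


def exchangeability_pvalue(X, n_perm=500, seed=0):
    """Permutation p-value for H0: the rows (individuals) of X are exchangeable.

    Assumption of this sketch: the columns (features) are mutually
    independent, so shuffling each column separately across individuals
    does not change the distribution of the data under H0.
    """
    rng = np.random.default_rng(seed)
    observed = pairwise_distance_variance(X)
    exceed = 0
    for _ in range(n_perm):
        # With axis=0, every column is shuffled independently of the others.
        X_perm = rng.permuted(X, axis=0)
        if pairwise_distance_variance(X_perm) >= observed:
            exceed += 1
    # The add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    homogeneous = rng.normal(size=(40, 50))   # one population
    stratified = homogeneous.copy()
    stratified[:20] += 2.0                    # two shifted sub-populations
    print("homogeneous p-value:", exchangeability_pvalue(homogeneous))
    print("stratified p-value:", exchangeability_pvalue(stratified))
```

Exchangeability would be rejected when the returned p-value falls below the user-specified significance level. Hidden sub-populations tend to inflate the spread of pairwise distances relative to the column-shuffled data, which is why a large observed statistic counts as evidence against the null in this sketch. Handling dependent features (for example, permuting blocks of features together) is outside what this example covers.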
