Detecting multimodality in empirical distributions is a fundamental problem in statistics and data analysis, with applications ranging from clustering to social science. Hartigan's Dip Test is a classical nonparametric procedure for testing unimodality versus multimodality, but its interpretation is hindered by strong dependence on sample size and the need for lookup tables. We introduce the Z-Dip, a standardized extension of the Dip Test that removes sample-size dependence by comparing observed Dip values to simulated null distributions. We calibrate a universal decision threshold for Z-Dip via simulation and bootstrap resampling, providing a unified criterion for multimodality detection. In the final section, we also propose a downsampling-based approach to further mitigate residual sample-size effects in very large datasets. Lookup tables and software implementations are made available for efficient use in practice.
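The standardization idea can be sketched in a few lines. The abstract does not state the exact formula, so this sketch assumes a z-score form: the observed statistic minus the mean of a simulated null distribution at the same sample size, divided by the null standard deviation. It also assumes a uniform null, the conventional reference distribution for the Dip Test. All function names here (`simulate_null`, `z_score`, `z_standardized`) are hypothetical, and `stat_fn` stands in for any Dip implementation the reader supplies.

```python
import random
import statistics
from typing import Callable, List, Sequence


def simulate_null(stat_fn: Callable[[Sequence[float]], float],
                  n: int, n_sim: int = 200, seed: int = 0) -> List[float]:
    """Evaluate stat_fn on n_sim uniform(0, 1) samples of size n.

    Assumption: the null is uniform, as is conventional for the Dip Test.
    """
    rng = random.Random(seed)
    return [stat_fn([rng.random() for _ in range(n)]) for _ in range(n_sim)]


def z_score(observed: float, null_values: Sequence[float]) -> float:
    """Standardize an observed statistic against simulated null values."""
    mu = statistics.fmean(null_values)
    sigma = statistics.stdev(null_values)  # sample standard deviation
    return (observed - mu) / sigma


def z_standardized(sample: Sequence[float],
                   stat_fn: Callable[[Sequence[float]], float],
                   n_sim: int = 200, seed: int = 0) -> float:
    """Standardized statistic for a sample, using a null of matching size."""
    null_values = simulate_null(stat_fn, len(sample), n_sim, seed)
    return z_score(stat_fn(sample), null_values)
```

Passing a Dip statistic as `stat_fn` would give a quantity in the spirit of the Z-Dip described above. Because the null is simulated at the sample's own size, the resulting score is expressed in null standard deviations and can be compared across sample sizes. In practice the null mean and standard deviation per sample size could be precomputed once and stored, which is presumably the role of the lookup tables mentioned in the abstract.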