Community detection, the task of grouping the nodes of a network, is one of the most studied problems in network science. Conventional methods usually require the number of communities $K$ to be specified in advance. This number is chosen heuristically or by a model selection criterion. In practice, different criteria yield different values of $K$ and therefore different partitions. We propose a community detection method based on recursive partitioning within a Bayesian framework. The method is compatible with a wide range of existing model-based community detection frameworks. In particular, it does not require the number of communities to be specified in advance, and it can capture the hierarchical structure of the network. We establish a theoretical guarantee of consistency under the stochastic block model and demonstrate the effectiveness of the method through simulations under different models that cover a broad range of scenarios. We apply the method to data from the California Department of Healthcare Access and Information (HCAI), which include all Emergency Department (ED) and hospital discharges from 342 hospitals, to identify regional hospital clusters.
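To make the idea of recursive partitioning without a pre-specified $K$ concrete, the following is a minimal illustrative sketch and not the procedure proposed in the paper. It assumes an undirected, unweighted adjacency matrix, bisects each node set by the sign of the Fiedler vector, and accepts a split only when a two-block stochastic block model beats a single block after a BIC-style penalty. The penalty is a crude stand-in for a Bayesian model comparison. The function names (`bernoulli_loglik`, `split_score`, `fiedler_split`, `recursive_partition`, `two_cliques`), the penalty form, and the `min_size` threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def bernoulli_loglik(edges, pairs):
    """Maximized Bernoulli log-likelihood of `edges` ones among `pairs` trials."""
    if edges == 0 or edges == pairs:  # also covers pairs == 0
        return 0.0
    p = edges / pairs
    return edges * np.log(p) + (pairs - edges) * np.log(1.0 - p)


def split_score(A, labels):
    """Penalized log-likelihood gain of a two-block SBM over a single block.

    ASSUMPTION: a BIC-style penalty (label cost plus parameter cost) stands in
    for a proper Bayesian comparison of the two models.
    """
    n = A.shape[0]
    i, j = np.triu_indices(n, k=1)          # each unordered node pair once
    a = A[i, j]
    li, lj = labels[i], labels[j]
    groups = (li & lj, ~li & ~lj, li ^ lj)  # within block 1, within block 2, between
    ll_two = sum(bernoulli_loglik(a[g].sum(), g.sum()) for g in groups)
    ll_one = bernoulli_loglik(a.sum(), a.size)
    # n*log(2) for the binary labels, plus two extra edge-probability parameters.
    penalty = n * np.log(2.0) + 0.5 * (3 - 1) * np.log(a.size)
    return (ll_two - ll_one) - penalty


def fiedler_split(A):
    """Bisect nodes by the sign of the Fiedler vector of the graph Laplacian."""
    L = np.diag(A.sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L)             # eigenvalues in ascending order
    return vecs[:, 1] >= 0


def recursive_partition(A, nodes=None, min_size=4):
    """Return a list of communities, each a sorted list of node indices."""
    if nodes is None:
        nodes = np.arange(A.shape[0])
    if len(nodes) < min_size:
        return [sorted(nodes.tolist())]
    sub = A[np.ix_(nodes, nodes)]
    labels = fiedler_split(sub)
    # Stop when the split is degenerate or the penalized gain is not positive.
    if labels.all() or not labels.any() or split_score(sub, labels) <= 0:
        return [sorted(nodes.tolist())]
    return (recursive_partition(A, nodes[labels], min_size)
            + recursive_partition(A, nodes[~labels], min_size))


def two_cliques(m):
    """Toy input: two m-cliques joined by a single bridge edge."""
    A = np.zeros((2 * m, 2 * m), dtype=int)
    A[:m, :m] = 1
    A[m:, m:] = 1
    np.fill_diagonal(A, 0)
    A[m - 1, m] = A[m, m - 1] = 1
    return A


communities = sorted(recursive_partition(two_cliques(8)))
print(communities)
```

On this toy graph the recursion stops after one split, because each clique is already fit perfectly by a single block and no further split can pay for its penalty. The number of communities is therefore an output of the stopping rule, not an input, and the sequence of accepted splits forms a binary tree that records the hierarchy.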