Community detection in real-world networks is typically addressed with graph clustering methods that partition the nodes of a network into disjoint subsets. While the definition of a community may vary, it is generally accepted that the elements of a community should be ``well-connected''. We evaluated clusters generated by the Leiden algorithm and the Iterative K-core (IKC) clustering algorithm for their susceptibility to becoming disconnected by the deletion of a small number of edges. A striking observation is that, for Leiden clusterings of real-world networks, the majority of clusters do not meet the relatively mild condition we enforce for well-connected clusters, except when the resolution parameter is large. We also constructed a modular pipeline that produces well-connected output clusters under a user-specified criterion for a valid community, based on cluster size and minimum edge cut size, and we describe the use of this pipeline on real-world and synthetic networks. An interesting trend we observed is that the final clusterings on real-world networks had small node coverage, suggesting that not all nodes in a network belong to communities.
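To make the validity criterion concrete, the following is a minimal sketch of a check on cluster size and minimum edge cut size. It is an illustration under stated assumptions, not the pipeline's implementation: the function names, the parameters `min_size` and `min_cut`, and the use of the Stoer-Wagner algorithm to compute the minimum edge cut are all choices made here for the example, and the thresholds shown are arbitrary placeholders for the user-specified values.

```python
def min_edge_cut_size(nodes, edges):
    """Size of a global minimum edge cut of an undirected graph (Stoer-Wagner).

    Assumes each undirected edge appears once in `edges`; self-loops are ignored.
    Returns 0 for a disconnected graph or a graph with fewer than two nodes.
    """
    nodes = list(nodes)
    if len(nodes) < 2:
        return 0
    # w[u][v] = number of original edges between (possibly merged) nodes u and v
    w = {u: {} for u in nodes}
    for u, v in edges:
        if u == v:
            continue
        w[u][v] = w[u].get(v, 0) + 1
        w[v][u] = w[v].get(u, 0) + 1
    best = float("inf")
    active = set(nodes)
    while len(active) > 1:
        # One minimum-cut phase: repeatedly add the most tightly connected node.
        start = next(iter(active))
        order = [start]
        conn = {v: w[start].get(v, 0) for v in active if v != start}
        while conn:
            nxt = max(conn, key=conn.get)
            cut_of_phase = conn.pop(nxt)
            order.append(nxt)
            for v, wt in w[nxt].items():
                if v in conn:
                    conn[v] += wt
        s, t = order[-2], order[-1]
        best = min(best, cut_of_phase)
        # Merge the last node t into the second-to-last node s.
        for v, wt in w[t].items():
            del w[v][t]
            if v != s:
                w[s][v] = w[s].get(v, 0) + wt
                w[v][s] = w[v].get(s, 0) + wt
        del w[t]
        active.remove(t)
    return best


def is_valid_community(cluster, edges, min_size, min_cut):
    """Hypothetical criterion: enough nodes, and no edge cut smaller than min_cut."""
    members = set(cluster)
    internal = [(u, v) for u, v in edges if u in members and v in members]
    return len(members) >= min_size and min_edge_cut_size(members, internal) >= min_cut


# Two triangles joined by a single bridge edge (2, 3).
barbell = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
whole = is_valid_community(range(6), barbell, min_size=3, min_cut=2)
triangle = is_valid_community([0, 1, 2], barbell, min_size=3, min_cut=2)
```

In this example the six-node cluster can be disconnected by deleting the single bridge edge, so it fails the check, while each triangle on its own passes. Whether the comparison against the cut threshold should be strict, and how the threshold should scale with cluster size, are left to the user-specified criterion.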