Community structure in social and collaborative networks often emerges from a complex interplay between structural mechanisms, such as degree heterogeneity and leader-driven attraction, and homophily on node attributes. Existing community detection methods typically address these dimensions in isolation, which limits their ability to recover interpretable communities in the presence of such mechanisms. In this paper, we propose AttDeCoDe, an attribute-driven extension of a density-based community detection framework, developed to analyse networks in which node characteristics play a central role in group formation. Instead of defining density purely from network topology, AttDeCoDe estimates node-wise density in the attribute space, allowing communities to form around attribute-based community representatives while preserving structural connectivity constraints. This approach naturally captures homophily-driven aggregation while remaining sensitive to leader influence. We evaluate the proposed method through a simulation study based on a novel generative model that extends the degree-corrected stochastic block model by incorporating attribute-driven leader attraction, reflecting key features of collaborative research networks. We then apply the method to research collaboration data from the Horizon programmes, where organisations are characterised by project-level thematic descriptors. Both the simulation and the empirical results show that AttDeCoDe offers a flexible and interpretable framework for community detection in attributed networks, achieving competitive performance relative to topology-based and attribute-assisted benchmarks.
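To make the central idea concrete, the following is a minimal illustrative sketch and not the AttDeCoDe algorithm itself, whose details the abstract does not specify. It assumes a Gaussian kernel density estimate over node attributes and a simple rule in which each node attaches to its densest graph neighbour, so that nodes with no denser neighbour act as community representatives. The function names, the `bandwidth` parameter, and the toy data are all hypothetical.

```python
import math


def attribute_density(X, bandwidth=1.0):
    """Gaussian kernel density of each node, computed in attribute space only."""
    n = len(X)
    dens = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        dens.append(total / n)
    return dens


def density_communities(adj, X, bandwidth=1.0):
    """Label each node with the representative reached by climbing to denser neighbours.

    adj: dict mapping node id -> set of neighbouring node ids (undirected graph)
    X:   list of attribute vectors, indexed by node id 0..n-1
    """
    dens = attribute_density(X, bandwidth)

    def rank(u):
        # Strict total order: density first, lower id wins ties (prevents cycles).
        return (dens[u], -u)

    parent = {}
    for v in adj:
        # Only graph neighbours are candidates, so connectivity is respected.
        best = max(adj[v], key=rank, default=None)
        if best is not None and rank(best) > rank(v):
            parent[v] = best
        else:
            parent[v] = v  # no denser neighbour: v is a representative

    def root(v):
        while parent[v] != v:
            v = parent[v]
        return v

    return {v: root(v) for v in adj}


# Toy network (assumed data): two triangles joined by the bridge edge 2-3,
# with one-dimensional attributes clustered around 1 and around 11.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
X = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
labels = density_communities(adj, X, bandwidth=1.0)
print(labels)
```

In this toy example the bridge edge does not merge the two groups, because nodes 2 and 3 each have a denser neighbour inside their own attribute cluster. A full method would presumably need a more careful treatment of bandwidth selection and of how connectivity constrains the density ascent.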