Finding potential research collaborators is challenging, especially in today's fast-growing, interdisciplinary research landscape. Traditional methods typically construct the research network from observable relationships such as co-authorships and citations. In this work, we rely solely on publication content: using BERTopic with a fine-tuned SciBERT model, we build a topic-based research network that connects and recommends researchers across disciplines based on shared topical interests. A major challenge we address is publication imbalance, where some researchers publish far more than others, often across several topics. Without careful handling, their less frequent interests are hidden beneath their dominant topics, limiting the network's ability to capture their full research scope. To tackle this, we introduce a cloning strategy that clusters each researcher's publications and treats each cluster as a separate node. This allows a researcher to belong to multiple communities, improving the detection of interdisciplinary links. Evaluation of the proposed method shows that the cloned network structure yields more meaningful communities and uncovers a broader set of collaboration opportunities.
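The cloning idea described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the toy 3-dimensional vectors stand in for document representations (the abstract uses BERTopic with a fine-tuned SciBERT model), the greedy cosine-threshold clustering is an assumed stand-in for whatever clustering the method uses, and all function names, thresholds, and the `researcher#k` node naming are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cluster_publications(vectors, threshold=0.8):
    """Greedy clustering (an assumption, not the paper's algorithm):
    each vector joins the most similar existing cluster if the similarity
    to that cluster's centroid reaches the threshold, else starts a new one."""
    clusters = []
    for vec in vectors:
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, centroid(cluster))
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([vec])
        else:
            best.append(vec)
    return clusters


def build_cloned_network(pubs_by_researcher, cluster_threshold=0.8, edge_threshold=0.8):
    """Create one clone node per publication cluster and link clones of
    different researchers whose cluster centroids are similar enough."""
    nodes = {}  # clone id -> (researcher, cluster centroid)
    for researcher, vectors in pubs_by_researcher.items():
        clusters = cluster_publications(vectors, cluster_threshold)
        for k, cluster in enumerate(clusters):
            nodes[f"{researcher}#{k}"] = (researcher, centroid(cluster))

    edges = []
    for (a, (ra, ca)), (b, (rb, cb)) in combinations(nodes.items(), 2):
        # Clones of the same researcher are never linked to each other.
        if ra != rb and cosine(ca, cb) >= edge_threshold:
            edges.append((a, b))
    return nodes, edges


# Toy data: "alice" publishes mostly on one topic and once on another.
pubs = {
    "alice": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.95, 0.05, 0.0], [0.0, 0.0, 1.0]],
    "bob": [[0.0, 0.1, 0.9]],
    "carol": [[1.0, 0.1, 0.0]],
}
nodes, edges = build_cloned_network(pubs)
print(sorted(nodes))  # ['alice#0', 'alice#1', 'bob#0', 'carol#0']
print(edges)          # [('alice#0', 'carol#0'), ('alice#1', 'bob#0')]
```

On this toy data, the prolific researcher is split into two clones, so her minority interest links to "bob" while her dominant interest links to "carol". If her four publications were averaged into a single profile instead, its cosine similarity to "bob" would be roughly 0.33, well below the assumed 0.8 edge threshold, and that link would be lost.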