Graph Machine Learning (GML) with Graph Databases (GDBs) has gained significant relevance in recent years, due to its ability to handle complex interconnected data and to apply ML techniques through Graph Data Science (GDS). However, a critical gap exists in the way current GDB-GML applications analyze data, especially with respect to Knowledge Completion (KC) in Knowledge Graphs (KGs). In particular, current architectures ignore KC and operate on datasets that appear incomplete or fragmented, even though they actually contain valuable hidden knowledge. This limitation may lead to incorrect interpretations when these data are used as input for GML models. This paper proposes an innovative architecture that integrates a KC phase into GDB-GML applications, demonstrating how revealing hidden knowledge can substantially change a dataset's behavior and metrics. For this purpose, we introduce scalable transitive relationships: links that propagate information over the network and are modelled by a decay function, allowing deterministic knowledge flow across multiple nodes. Experimental results show that our approach radically reshapes both the topology and the overall dynamics of the datasets, underscoring the need for this new GDB-GML architecture to produce better models and unlock the full potential of graph-based data analysis.
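To make the notion of scalable transitive relationships concrete, the following sketch materializes inferred edges whose weight decays with path length. It is an illustrative assumption, not the paper's implementation: the exponential decay form, the `decay` and `max_hops` parameters, and the use of networkx are all placeholders for whatever decay function and graph store the architecture actually employs.

```python
import networkx as nx

def materialize_transitive_edges(g: nx.DiGraph, decay: float = 0.5, max_hops: int = 3) -> nx.DiGraph:
    """Add inferred transitive edges whose weight decays with hop count.

    The weight formula ``decay ** (hops - 1)`` is an illustrative assumption,
    not the decay function defined in the paper.
    """
    enriched = g.copy()
    for source in g.nodes:
        # All nodes reachable from `source` within `max_hops`, with their hop counts.
        lengths = nx.single_source_shortest_path_length(g, source, cutoff=max_hops)
        for target, hops in lengths.items():
            # Only add genuinely transitive (multi-hop) links that are not already explicit.
            if hops > 1 and not g.has_edge(source, target):
                enriched.add_edge(source, target, weight=decay ** (hops - 1), inferred=True)
    return enriched

# Toy usage: A -> B -> C yields an inferred edge A -> C with weight 0.5.
g = nx.DiGraph([("A", "B"), ("B", "C")])
print(materialize_transitive_edges(g).edges(data=True))
```

In this reading, explicit edges keep their original semantics while inferred edges carry progressively weaker weights, so downstream GML models see the completed topology without treating distant, weakly connected nodes as direct neighbors.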