In today's rapidly expanding data landscape, knowledge extraction from unstructured text is vital for real-time analytics, temporal inference, and dynamic memory frameworks. However, traditional static knowledge graph (KG) construction often overlooks the dynamic and time-sensitive nature of real-world data, limiting adaptability to continuous change. Moreover, recent zero- or few-shot approaches that avoid domain-specific fine-tuning or reliance on prebuilt ontologies often suffer from instability across multiple runs, as well as incomplete coverage of key facts. To address these challenges, we introduce ATOM (AdapTive and OptiMized), a few-shot and scalable approach that builds and continuously updates Temporal Knowledge Graphs (TKGs) from unstructured text. ATOM splits input documents into minimal, self-contained "atomic" facts, improving extraction exhaustivity and stability. It then constructs atomic TKGs from these facts, employing dual-time modeling that distinguishes when information is observed from when it is valid. The resulting atomic TKGs are subsequently merged in parallel. Empirical evaluations demonstrate that ATOM achieves ~18% higher exhaustivity, ~17% better stability, and over 90% latency reduction compared to baseline methods, indicating strong scalability potential for dynamic TKG construction.
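The dual-time modeling described above, where each fact carries both an observation timestamp and a validity interval, can be sketched minimally as follows. The names and structures here are illustrative assumptions, not ATOM's actual data model; the merge step simply unions atomic graphs with set semantics, so duplicate facts extracted from different chunks collapse automatically:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class AtomicFact:
    """One minimal, self-contained temporal fact (illustrative schema)."""
    subject: str
    predicate: str
    obj: str
    t_observed: date                # when the fact was extracted from text
    t_valid_start: Optional[date]   # when the stated fact begins to hold
    t_valid_end: Optional[date]     # None = open-ended / still valid

def merge(atomic_tkgs):
    """Union atomic TKGs; identical facts deduplicate via frozen-dataclass hashing."""
    merged = set()
    for graph in atomic_tkgs:
        merged |= graph
    return merged

# Two atomic TKGs extracted from different document chunks.
g1 = {AtomicFact("Alice", "works_at", "Acme", date(2024, 1, 5), date(2023, 6, 1), None)}
g2 = {AtomicFact("Alice", "works_at", "Acme", date(2024, 1, 5), date(2023, 6, 1), None),
      AtomicFact("Bob", "ceo_of", "Acme", date(2024, 1, 5), date(2020, 1, 1), None)}

print(len(merge([g1, g2])))  # 2 — the duplicate "Alice works_at Acme" fact collapses
```

Because merging is a commutative, associative set union here, independent atomic TKGs could be combined in parallel in any order, which is consistent with the parallel merging the abstract describes.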