Knowledge gaps and hallucinations are persistent challenges for Large Language Models (LLMs), which generate unreliable responses when they lack the information needed to fulfill user instructions. Existing approaches, such as Retrieval-Augmented Generation (RAG) and tool use, aim to address these issues by incorporating external knowledge. Yet they rely on additional models or services, resulting in complex pipelines and potential error propagation, and they often require the model to process a large number of tokens. In this paper, we present a scalable method that enables LLMs to access external knowledge without depending on retrievers or auxiliary models. Our approach uses constrained generation with a pre-built prefix-tree index: triples from a Knowledge Graph are verbalized into textual facts, tokenized, and indexed in a prefix tree for efficient access. During inference, the LLM acquires external knowledge by generating facts under constrained generation, which permits only token sequences that form an existing fact. We evaluate our proposal on Question Answering and show that it scales to large knowledge bases (800 million facts), adapts to domain-specific data, and achieves effective results, all with minimal generation-time overhead. ReFactX code is available at https://github.com/rpo19/ReFactX.
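To make the core mechanism concrete, below is a minimal sketch of a token-level prefix tree built from verbalized facts and used to mask logits during greedy decoding. It assumes a Hugging Face-style tokenizer and causal LM; the model choice, the `facts` list, and helper names such as `allowed_next` are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch: constrained decoding over a prefix tree of tokenized facts.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # assumed model for illustration
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# Verbalized facts; in the paper these come from Knowledge Graph triples.
facts = [
    "Rome is the capital of Italy.",
    "Paris is the capital of France.",
]

# Build a token-level prefix tree: each node maps a token id to a child node.
trie = {}
for fact in facts:
    node = trie
    for tok in tokenizer.encode(fact):
        node = node.setdefault(tok, {})

def allowed_next(node):
    """Token ids that extend some indexed fact from this node."""
    return list(node.keys())

# Constrained greedy decoding: at each step, mask every token that would not
# extend an indexed fact, so the model can only emit existing facts.
input_ids = tokenizer.encode("Fact: ", return_tensors="pt")
node = trie
with torch.no_grad():
    while node:  # an empty node means a complete fact has been generated
        logits = model(input_ids).logits[0, -1]
        mask = torch.full_like(logits, float("-inf"))
        allowed = allowed_next(node)
        mask[allowed] = logits[allowed]
        next_tok = int(mask.argmax())
        input_ids = torch.cat([input_ids, torch.tensor([[next_tok]])], dim=1)
        node = node[next_tok]

print(tokenizer.decode(input_ids[0]))
```

Because the mask restricts decoding to paths in the tree, every generated continuation is guaranteed to be one of the indexed facts, and the per-step overhead is a dictionary lookup plus a logits mask; a production index would also need to handle tokenization boundaries and scale the tree to hundreds of millions of facts.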