Large language models often hallucinate when processing long and noisy retrieval contexts because they rely on spurious correlations rather than genuine causal relationships. We propose CIP, a lightweight, plug-and-play causal prompting framework that mitigates hallucinations at the input stage. CIP constructs a causal relation sequence over entities, actions, and events and injects it into the prompt to steer reasoning toward causally relevant evidence. Through causal intervention and counterfactual reasoning, CIP suppresses non-causal reasoning paths, improving factual grounding and interpretability. Experiments across seven mainstream language models, including GPT-4o, Gemini 2.0 Flash, and Llama 3.1, show that CIP consistently enhances reasoning quality and reliability, yielding a 2.6-point improvement in Attributable Rate, a 0.38 improvement in Causal Consistency Score, and a fourfold increase in effective information density. API-level profiling further shows that CIP accelerates contextual understanding and reduces end-to-end response latency by up to 55.1%. These results suggest that causal reasoning may serve as a promising paradigm for improving the explainability, stability, and efficiency of large language models.
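To make the input-stage mechanism concrete, the following minimal sketch (not the authors' implementation; all function and class names are hypothetical) illustrates how a causal relation sequence over entities, actions, and events might be serialized and prepended to a prompt so that the model attends to causally relevant evidence rather than spurious co-occurrences.

```python
# Minimal sketch of input-stage causal prompt injection.
# Assumptions: the causal links have already been extracted from the
# retrieval context; CausalLink, build_causal_sequence, and
# build_cip_prompt are illustrative names, not the paper's API.

from dataclasses import dataclass
from typing import List


@dataclass
class CausalLink:
    cause: str   # entity, action, or event acting as the cause
    effect: str  # entity, action, or event acting as the effect


def build_causal_sequence(links: List[CausalLink]) -> str:
    """Serialize cause -> effect links into a compact causal relation sequence."""
    return " ; ".join(f"{link.cause} -> {link.effect}" for link in links)


def build_cip_prompt(context: str, question: str, links: List[CausalLink]) -> str:
    """Inject the causal relation sequence ahead of the noisy context so the
    model is guided toward causally relevant evidence."""
    causal_sequence = build_causal_sequence(links)
    return (
        "Causal relations extracted from the context:\n"
        f"{causal_sequence}\n\n"
        "Answer using only evidence consistent with these causal relations; "
        "ignore passages that merely co-occur with the question terms.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    links = [
        CausalLink("heavy rainfall", "river flooding"),
        CausalLink("river flooding", "road closures announced by the city"),
    ]
    prompt = build_cip_prompt(
        context="(long, noisy retrieval context goes here)",
        question="Why were the roads closed?",
        links=links,
    )
    print(prompt)
```

The causal sequence acts purely as an input-side scaffold: no model weights or decoding procedures are changed, which is what makes the approach plug-and-play across different models.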