The detection of sophisticated hallucinations in Large Language Models (LLMs) is hampered by a ``Detection Dilemma'': methods that probe internal states (Internal State Probing) excel at identifying factual inconsistencies but fail on logical fallacies, while those that verify externalized reasoning (Chain-of-Thought Verification) show the opposite behavior. This schism creates a task-dependent blind spot: Chain-of-Thought Verification fails on fact-intensive tasks such as open-domain QA, where reasoning is ungrounded, while Internal State Probing is ineffective on logic-intensive tasks such as mathematical reasoning, where models are confidently wrong. We resolve this dilemma with a unified framework that bridges the two paradigms. However, unification is hindered by two fundamental challenges: the Signal Scarcity Barrier, as coarse symbolic reasoning chains lack signals directly comparable to fine-grained internal states, and the Representational Alignment Barrier, a deep-seated mismatch between their underlying semantic spaces. To overcome these, we introduce a multi-path reasoning mechanism that yields more comparable, fine-grained signals, and a segment-aware temporalized cross-attention module that adaptively fuses these now-aligned representations to pinpoint subtle dissonances. Extensive experiments on three diverse benchmarks and two leading LLMs demonstrate that our framework consistently and significantly outperforms strong baselines. Our code is available at https://github.com/peach918/HalluDet.
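To make the fusion idea concrete, the sketch below shows one plausible form of segment-aware cross-attention that fuses internal-state features with reasoning-path features and scores an example for hallucination. It is a minimal illustrative assumption, not the released implementation: all module names, shapes, the segment-embedding scheme, and the pooling/classifier head are hypothetical.

```python
# Hedged sketch (not the authors' released code): segment-aware cross-attention
# between internal hidden states (queries) and multi-path reasoning features
# (keys/values). Names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class SegmentAwareCrossAttention(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 8, n_segments: int = 4):
        super().__init__()
        # Learned embedding marking which reasoning segment each step belongs to.
        self.segment_emb = nn.Embedding(n_segments, d_model)
        # Queries come from internal states; keys/values from reasoning paths.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classifier = nn.Linear(d_model, 1)  # per-example hallucination logit

    def forward(self, hidden_states, path_states, path_segment_ids):
        # hidden_states:    (B, T_h, d) internal-state sequence (probe features)
        # path_states:      (B, T_p, d) step features from multi-path reasoning
        # path_segment_ids: (B, T_p)    segment index of each reasoning step
        keys = path_states + self.segment_emb(path_segment_ids)
        fused, _ = self.cross_attn(query=hidden_states, key=keys, value=keys)
        pooled = fused.mean(dim=1)       # simple mean pooling over fused sequence
        return self.classifier(pooled)   # higher logit = more likely hallucinated

# Usage (shapes only):
# model = SegmentAwareCrossAttention()
# logit = model(torch.randn(2, 32, 768), torch.randn(2, 48, 768),
#               torch.randint(0, 4, (2, 48)))
```

The design choice illustrated here is only the alignment direction implied by the abstract: internal states attend over reasoning-path representations so that dissonances between the two signal sources surface in the fused features; the paper's actual temporalization and segmentation details may differ.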