Hallucination Detection via Internal States and Structured Reasoning Consistency in Large Language Models

The detection of sophisticated hallucinations in Large Language Models (LLMs) is hampered by a "Detection Dilemma": methods that probe internal states (Internal State Probing) excel at identifying factual inconsistencies but fail on logical fallacies, while methods that verify externalized reasoning (Chain-of-Thought Verification) show the opposite behavior. This split creates a task-dependent blind spot: Chain-of-Thought Verification fails on fact-intensive tasks such as open-domain QA, where reasoning is ungrounded, while Internal State Probing is ineffective on logic-intensive tasks such as mathematical reasoning, where models are confidently wrong. We address this gap with a unified framework that combines both kinds of signal. Unification is hindered by two fundamental challenges: the Signal Scarcity Barrier, as coarse symbolic reasoning chains lack signals directly comparable to fine-grained internal states, and the Representational Alignment Barrier, a deep-seated mismatch between their underlying semantic spaces. To overcome these, we introduce a multi-path reasoning mechanism to obtain more comparable, fine-grained signals, and a segment-aware temporalized cross-attention module to adaptively fuse these now-aligned representations, pinpointing subtle dissonances. Extensive experiments on three diverse benchmarks and two leading LLMs demonstrate that our framework consistently and significantly outperforms strong baselines. Our code is available at https://github.com/peach918/HalluDet.
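To make the fusion idea concrete, here is a minimal sketch of plain scaled dot-product cross-attention, in which vectors standing in for per-token internal states attend over vectors standing in for reasoning-step embeddings. It is not the segment-aware temporalized module described in the abstract, whose details are not given here. The function names, the toy vectors, and the choice to reuse the reasoning-step vectors as both keys and values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (illustrative only).

    queries: list of d-dim vectors (hypothetical per-token internal states)
    keys, values: lists of vectors (hypothetical reasoning-step embeddings)
    Returns (fused, weights): one fused vector and one weight row per query.
    """
    d = len(queries[0])
    fused, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        w = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w[j] * values[j][c] for j in range(len(values)))
            for c in range(len(values[0]))
        ]
        fused.append(out)
        weights.append(w)
    return fused, weights


# Toy data: three "internal state" vectors and two "reasoning step" vectors.
internal_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
reasoning_steps = [[1.0, 0.0], [0.0, 1.0]]

fused, weights = cross_attention(internal_states, reasoning_steps, reasoning_steps)
for row in weights:
    print([round(x, 3) for x in row])
```

Each row of `weights` shows how strongly one internal-state vector attends to each reasoning step. In a detector built along these lines, the fused vectors would presumably feed a classifier, but that part is outside what this sketch covers.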

