Although large language models (LLMs) display semantic competence, the internal mechanisms by which they ground abstract semantic structure remain insufficiently characterised. We propose a method that integrates role-cross minimal pairs, temporal emergence analysis, and cross-model comparison to study how LLMs implement semantic roles. Our analysis uncovers: (i) highly concentrated circuits (89-94% of attribution within 28 nodes); (ii) gradual structural refinement rather than phase transitions, with larger models sometimes bypassing localised circuits; and (iii) moderate cross-scale conservation (24-59% component overlap) alongside high spectral similarity. These findings suggest that LLMs form compact, causally isolated mechanisms for abstract semantic structure, and that these mechanisms transfer partially across scales and architectures.
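As an illustration of the kind of metric behind the "24-59% component overlap" figure (the paper does not specify its exact formula, so this is an assumption), cross-scale conservation could be quantified as the Jaccard overlap between the sets of circuit components identified in two models; the node labels below are hypothetical.

```python
def component_overlap(circuit_a, circuit_b):
    """Jaccard overlap between two sets of circuit components (nodes)."""
    a, b = set(circuit_a), set(circuit_b)
    return len(a & b) / len(a | b)

# Hypothetical circuit node sets from a smaller and a larger model
# (attention heads as "L<layer>.H<head>", MLP blocks as "L<layer>.MLP").
small = {"L3.H2", "L5.H7", "L5.H9", "L8.MLP"}
large = {"L3.H2", "L5.H9", "L8.MLP", "L12.H1", "L14.MLP"}
print(component_overlap(small, large))  # 3 shared nodes / 6 total = 0.5
```

A symmetric set-overlap measure like this is one natural reading of "component overlap"; attribution-weighted variants would weight each shared node by its attribution score instead of counting nodes equally.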