Recent large language models have demonstrated notable capabilities in solving problems that require logical reasoning; however, the internal mechanisms behind these capabilities remain largely unexplored. In this paper, we show that a small language model can solve a deductive reasoning task by learning the underlying rules rather than operating as a statistical learner. We then provide a low-level explanation of its internal representations and computational circuits. Our findings reveal that induction heads play a central role in implementing the rule completion and rule chaining steps of the logical inference required by the task.
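As an illustration of the rule completion and rule chaining steps mentioned above, the following is a minimal sketch of a prefix-match-and-copy lookup, written in the spirit of an induction head. It is not the model or the task studied in the paper: the token format (`premise -> conclusion ;`), the function names `complete_rule` and `chain_rules`, and the example context are all assumptions made for illustration.

```python
from typing import List, Optional


def complete_rule(tokens: List[str], premise: str, arrow: str = "->") -> Optional[str]:
    """Find the most recent 'premise arrow X' pattern in the context and copy X.

    This mimics the match-then-copy behaviour usually attributed to induction
    heads: locate an earlier occurrence of the current prefix and return the
    token that followed it. Returns None if no rule has this premise.
    """
    # Scan backwards so that the most recent matching rule wins.
    for i in range(len(tokens) - 3, -1, -1):
        if tokens[i] == premise and tokens[i + 1] == arrow:
            return tokens[i + 2]
    return None


def chain_rules(tokens: List[str], start: str, max_steps: int = 10) -> List[str]:
    """Repeatedly apply rule completion, feeding each conclusion back in as a premise."""
    path = [start]
    current = start
    for _ in range(max_steps):
        conclusion = complete_rule(tokens, current)
        # Stop when no rule applies or when a cycle would be entered.
        if conclusion is None or conclusion in path:
            break
        path.append(conclusion)
        current = conclusion
    return path


# Hypothetical context encoding three rules: A -> B, B -> C, C -> D.
context = ["A", "->", "B", ";", "B", "->", "C", ";", "C", "->", "D", ";"]

print(complete_rule(context, "B"))  # single rule completion
print(chain_rules(context, "A"))    # chained inference starting from A
```

Here a single call to `complete_rule` stands in for one rule completion step, and the loop in `chain_rules` stands in for rule chaining. A transformer would realise such steps through attention patterns rather than explicit loops, so the sketch only conveys the input-output behaviour.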