AIR: Post-training Data Selection for Reasoning via Attention Head Influence

Large language models (LLMs) achieve remarkable multi-step reasoning capabilities, yet effectively transferring these skills via post-training distillation remains challenging. Existing data selection methods, ranging from manual curation to heuristics based on length, entropy, or overall loss, fail to capture the causal importance of individual reasoning steps, which limits distillation efficiency. To address this, we propose Attention Influence for Reasoning (AIR), a principled, unsupervised, and training-free framework that leverages mechanistic insights into retrieval heads to select high-value post-training data. AIR first identifies the reasoning-critical attention heads of an off-the-shelf model, then constructs a weakened reference model in which the influence of those heads is disabled, and finally quantifies the resulting loss divergence between the two models as the Attention Influence Score. This score enables fine-grained assessment at both the step and sample levels, supporting step-level weighted fine-tuning and global sample selection. Experiments across multiple reasoning benchmarks show that AIR consistently improves reasoning accuracy, surpasses heuristic baselines, and effectively isolates the most critical steps and samples. Our work establishes a mechanism-driven, data-efficient approach to reasoning distillation in LLMs.
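
The scoring pipeline described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the loss divergence is the per-token cross-entropy of the weakened reference model minus that of the base model, averaged over a step or over the whole sample. The function names (`attention_influence_scores`, `step_weights`, `select_top_k`), the clipping and normalization of the weights, and the top-k selection rule are all hypothetical choices for illustration, not the authors' implementation. The per-token losses are taken as given; computing them would require running both models.

```python
from statistics import mean


def attention_influence_scores(base_losses, weak_losses, step_bounds):
    """Score one sample at the step level and the sample level.

    base_losses, weak_losses: per-token losses from the base model and from
        the weakened reference model (assumed to be precomputed).
    step_bounds: list of (start, end) token index ranges, end exclusive,
        one per reasoning step (an assumed segmentation).
    """
    if len(base_losses) != len(weak_losses):
        raise ValueError("loss sequences must have the same length")
    # Assumed divergence: how much the loss rises once the heads are disabled.
    deltas = [w - b for b, w in zip(base_losses, weak_losses)]
    step_scores = [mean(deltas[start:end]) for start, end in step_bounds]
    sample_score = mean(deltas)
    return step_scores, sample_score


def step_weights(step_scores):
    """Turn step scores into loss weights that average to 1 (assumed scheme)."""
    clipped = [max(score, 0.0) for score in step_scores]
    total = sum(clipped)
    if total == 0:
        # No step shows positive influence: fall back to uniform weights.
        return [1.0] * len(clipped)
    return [c * len(clipped) / total for c in clipped]


def select_top_k(sample_scores, k):
    """Return the indices of the k highest-scoring samples."""
    order = sorted(range(len(sample_scores)),
                   key=lambda i: sample_scores[i], reverse=True)
    return order[:k]


if __name__ == "__main__":
    # Toy numbers, not experimental results.
    base = [1.0, 1.0, 2.0, 2.0]
    weak = [1.5, 1.5, 2.0, 4.0]
    steps, sample = attention_influence_scores(base, weak, [(0, 2), (2, 4)])
    print(steps, sample)
    print(step_weights(steps))
    print(select_top_k([0.1, 0.9, 0.5], k=2))
```

Under these assumptions, a step whose loss rises sharply once the heads are disabled receives a larger fine-tuning weight, and samples are ranked by their mean divergence for global selection.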
