Recent advances in Large Language Models (LLMs), particularly model scaling and test-time techniques, have greatly enhanced the reasoning capabilities of language models at the expense of higher inference costs. To lower these costs, prior works train router models or deferral mechanisms that allocate easy queries to a small, efficient model while forwarding harder queries to larger, more expensive models. However, these trained routers often lack robustness under domain shift and require expensive data synthesis techniques, such as Monte Carlo rollouts, to obtain sufficient ground-truth routing labels for training. In this work, we propose Confidence-Guided Stepwise Model Routing for Cost-Efficient Reasoning (STEER), a domain-agnostic framework that performs fine-grained, step-level routing between smaller and larger LLMs without relying on external models. STEER leverages confidence scores derived from the smaller model's logits before generating each reasoning step, so that the larger model is invoked only when necessary. Extensive evaluations with different LLMs on a diverse set of challenging benchmarks spanning Mathematical Reasoning, Multi-Hop QA, and Planning tasks show that STEER achieves competitive or improved accuracy while reducing inference costs (up to +20% accuracy with 48% fewer FLOPs than using the larger model alone on AIME), outperforming baselines that rely on trained external modules. Our results establish model-internal confidence as a robust, domain-agnostic signal for model routing, offering a scalable path toward efficient LLM deployment.
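To make the routing mechanism concrete, the sketch below illustrates the general idea of confidence-guided stepwise routing under assumptions of ours: the small model's per-step confidence is taken as the mean token probability of its drafted step, the threshold and stop marker are placeholders, and the `StepGen` interface is hypothetical rather than the paper's actual implementation.

```python
# Minimal sketch of confidence-guided stepwise routing.
# The StepGen interface, confidence metric, threshold, and stop marker
# are illustrative assumptions, not the paper's exact method.
import math
from typing import Callable, List, Tuple

# A step generator is assumed to return (step_text, token_logprobs)
# for the next reasoning step, given the prompt and the steps so far.
StepGen = Callable[[str, List[str]], Tuple[str, List[float]]]

def mean_token_confidence(logprobs: List[float]) -> float:
    """Average token probability of a drafted step (one possible
    confidence score derived from the small model's logits)."""
    if not logprobs:
        return 0.0
    return sum(math.exp(lp) for lp in logprobs) / len(logprobs)

def steer_route(prompt: str,
                small_step: StepGen,
                large_step: StepGen,
                threshold: float = 0.8,
                max_steps: int = 16) -> List[str]:
    """Generate a reasoning chain step by step, escalating to the
    large model only when the small model's confidence is low."""
    steps: List[str] = []
    for _ in range(max_steps):
        step, logprobs = small_step(prompt, steps)
        if mean_token_confidence(logprobs) < threshold:
            # Low confidence: discard the draft and query the large model.
            step, _ = large_step(prompt, steps)
        steps.append(step)
        if step.strip().startswith("Final answer"):  # assumed stop marker
            break
    return steps
```

In this sketch, the cost saving comes from the fact that `large_step` is called only on the subset of steps where the small model's own logits signal uncertainty; everything else is served by the cheaper model.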