AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs

From arXiv. 9 pages (main) + appendix, 3 figures. Accepted at NeurIPS 2025 (Position Paper Track), submission #491. OpenReview: https://openreview.net/forum?id=6plSmhBI33&noteId=KP5ZqY7JLg

This position paper challenges the "scaling fundamentalism" dominating AI research, where unbounded growth in model size and computation has led to unsustainable environmental impacts and widening resource inequality. We argue that LLM development should be fundamentally reoriented toward capability-per-resource rather than capability alone. We present a theoretical framework demonstrating that resource-allocation decisions guided by gradient influence patterns can dramatically improve efficiency throughout the AI lifecycle. Our analysis shows that in transformer-based models, where a small fraction of parameters exert outsized influence (following heavy-tailed distributions), three critical insights emerge: (1) updating only high-influence parameters strictly outperforms full-parameter tuning on a performance-per-resource basis; (2) simple gradient norms provide computationally efficient proxies for identifying these high-influence components; and (3) coordinated parameter and data selection yields multiplicative efficiency gains, potentially reducing resource requirements by orders of magnitude. Building on these theoretical foundations, we propose a two-stage paradigm: marginal-return pretraining for foundation developers and influence-guided adaptation for downstream users, bridged by gradient blueprints, which are metadata describing which parameters matter most for various tasks. This capability-per-resource perspective transforms what were once considered pragmatic hardware workarounds into theoretically optimal strategies, democratizing access to cutting-edge AI capabilities while significantly reducing environmental impact. By embedding resource consciousness into how we develop, adapt, and evaluate models, we can reshape AI progress toward a more sustainable and equitable future.
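The abstract's first two insights (update only high-influence parameters, and use gradient norms as a cheap proxy for influence) can be illustrated with a minimal sketch. This is not the paper's implementation: the block-level granularity, the function names (`block_grad_norms`, `select_top_blocks`, `masked_sgd_step`), the toy parameter names, and the gradient values are all assumptions made for illustration.

```python
import math

def block_grad_norms(grads):
    """L2 norm of the gradient for each named parameter block (influence proxy)."""
    return {name: math.sqrt(sum(g * g for g in vec)) for name, vec in grads.items()}

def select_top_blocks(norms, fraction):
    """Names of the top `fraction` of blocks, ranked by gradient norm."""
    k = max(1, math.ceil(fraction * len(norms)))
    ranked = sorted(norms, key=norms.get, reverse=True)
    return set(ranked[:k])

def masked_sgd_step(params, grads, selected, lr):
    """Plain SGD applied only to the selected blocks; all other blocks stay frozen."""
    return {
        name: [p - lr * g for p, g in zip(vec, grads[name])] if name in selected else list(vec)
        for name, vec in params.items()
    }

# Hypothetical toy model: four parameter blocks with made-up gradients.
params = {
    "attn.q": [0.5, -0.2],
    "attn.k": [0.1, 0.3],
    "mlp.up": [1.0, 1.0],
    "mlp.down": [0.0, 0.4],
}
grads = {
    "attn.q": [3.0, 4.0],      # one block dominates, mimicking a heavy tail
    "attn.k": [0.01, 0.02],
    "mlp.up": [0.1, 0.0],
    "mlp.down": [0.6, 0.8],
}

norms = block_grad_norms(grads)
selected = select_top_blocks(norms, fraction=0.25)   # keep the top 25% of blocks
new_params = masked_sgd_step(params, grads, selected, lr=0.1)
```

In this toy setting only `attn.q` is updated, so the optimizer touches a quarter of the blocks. Whether such a selection preserves most of the achievable loss reduction in a real model depends on how heavy-tailed the influence distribution actually is, which is the claim the paper argues for.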

