Scaling Law Papers - Zhuanzhi (专知)


Scaling Law

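Current research indicates that scaling up model size is a key factor in improving LLM capability. The progression from GPT-3 at 175B parameters to PaLM at 540B parameters supports the view that larger models are more capable. A large model is essential, but scaling laws are not limited to model size; they cover three aspects: model size, data size, and total compute. In addition, the quality of pre-training data plays a key role in model performance, so data collection and cleaning strategies need attention when expanding the corpus.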
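To make the relationship between the three aspects concrete, here is a minimal sketch using the widely cited rule of thumb that training compute is roughly C ≈ 6 · N · D FLOPs, where N is the parameter count and D is the number of training tokens. This approximation is not stated on this page; it is brought in as an assumption. The parameter counts come from the text above, while the token counts and the function name `training_flops` are illustrative assumptions.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute in FLOPs as C ~= 6 * N * D.

    This is a common rule of thumb (an assumption here), not a formula
    taken from this page.
    """
    return 6.0 * n_params * n_tokens


# Parameter counts are from the text; token counts are assumed for illustration.
models = {
    "GPT-3": (175e9, 300e9),
    "PaLM": (540e9, 780e9),
}

for name, (n_params, n_tokens) in models.items():
    flops = training_flops(n_params, n_tokens)
    print(f"{name}: N={n_params:.2e}, D={n_tokens:.2e}, C~{flops:.2e} FLOPs")
```

Under this approximation, fixing any two of model size, data size, and total compute determines the third, which is why the three are usually discussed together.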

Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches

Arxiv

0+ reads · March 3

Scaling Large Language Model-based Multi-Agent Collaboration

Arxiv

0+ reads · March 17

Scaling Large-Language-Model-based Multi-Agent Collaboration

Arxiv

0+ reads · February 28

Unsourced Random Access in MIMO Quasi-Static Rayleigh Fading Channels: Finite Blocklength and Scaling Law Analyses

Arxiv

0+ reads · March 21

Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families

Arxiv

0+ reads · February 5

Parametric Scaling Law of Tuning Bias in Conformal Prediction

Arxiv

0+ reads · February 5

How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines

Arxiv

0+ reads · February 17

An Efficient Large Recommendation Model: Towards a Resource-Optimal Scaling Law

Arxiv

0+ reads · February 14

Unlocking Scaling Law in Industrial Recommendation Systems with a Three-step Paradigm based Large User Model

Arxiv

0+ reads · February 12

Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families

Arxiv

0+ reads · December 25, 2024

The Scaling Law for LoRA Base on Mutual Information Upper Bound

Arxiv

0+ reads · January 6

Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families

Arxiv

1+ reads · December 23, 2024

P$^2$ Law: Scaling Law for Post-Training After Model Pruning

Arxiv

1+ reads · December 16, 2024

Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families

Arxiv

0+ reads · December 9, 2024

Scalable Analysis of Urban Scaling Laws: Leveraging Cloud Computing to Analyze 21,280 Global Cities

Arxiv

0+ reads · December 3, 2024

