We propose Parallel Token Prediction (PTP), a universal framework for parallel sequence generation in language models. PTP jointly predicts multiple dependent tokens in a single transformer call by incorporating the sampling procedure into the model itself. This reduces the latency bottleneck of autoregressive decoding and avoids the restrictive independence assumptions common in existing multi-token prediction methods. We prove that PTP can represent arbitrary autoregressive sequence distributions. PTP can be trained either by distilling an existing model or, without a teacher, through inverse autoregressive training. Experimentally, we achieve state-of-the-art speculative decoding performance on Vicuna-7B, accepting more than four tokens per step on Spec-Bench. The universality of our framework indicates that parallel generation of long sequences is feasible without loss of modeling power.