GNN-LM:通过GNNN建立以全球背景为基础的语言建模 (GNN-LM: Language Modeling based on Global Contexts via GNN) - 专知论文

会员服务 ·

0

语言模型化 · MoDELS · 图 · Perplexity · 有向 ·

2022 年 1 月 21 日

GNN-LM: Language Modeling based on Global Contexts via GNN

翻译：GNN-LM:通过GNNN建立以全球背景为基础的语言建模

Yuxian Meng,Shi Zong,Xiaoya Li,Xiaofei Sun,Tianwei Zhang,Fei Wu,Jiwei Li

from arxiv, To appear at ICLR 2022

Inspired by the notion that {\it to copy is easier than to memorize}, in this work, we introduce GNN-LM, which extends the vanilla neural language model (LM) by allowing to reference similar contexts in the entire training corpus. We build a directed heterogeneous graph between an input context and its semantically related neighbors selected from the training corpus, where nodes are tokens in the input context and retrieved neighbor contexts, and edges represent connections between nodes. Graph neural networks (GNNs) are constructed upon the graph to aggregate information from similar contexts to decode the token. This learning paradigm provides direct access to the reference contexts and helps improve a model's generalization ability. We conduct comprehensive experiments to validate the effectiveness of the GNN-LM: GNN-LM achieves a new state-of-the-art perplexity of 14.8 on WikiText-103 (a 4.5 point improvement over its counterpart of the vanilla LM model) and shows substantial improvement on One Billion Word and Enwiki8 datasets against strong baselines. In-depth ablation studies are performed to understand the mechanics of GNN-LM. The code can be found at \url{https://github.com/ShannonAI/GNN-LM}

翻译：在这项工作中,我们引入了GNN-LM,通过允许在整个培训材料中参考类似背景,扩展了香草神经语言模型(LM),从而扩展了香草神经语言模型(LM),我们在整个培训材料中建立了输入背景与其从培训材料中挑选的与语义相关的邻居之间的定向混合图,在输入背景中,节点是符号,取回相邻背景,边缘代表节点之间的连接。图形神经网络(GNNNs)建在图表上,将类似背景的信息汇总起来,以解码符号。这种学习模式直接提供参考背景,并帮助提高模型的概括化能力。我们开展了全面实验,以验证GNN-LM的有效性:GNN-LM在WikitText-103上实现了14.8的新的状态-艺术性混乱(比Vanilla LM模型的对应方值改进4.5个百分点),并展示了“一亿维Word和Enwiki8数据集与强基线之间的重大改进。深入的amburMnal研究是理解GNNS/NM的机械。

4

相关内容

语言模型化

语言模型化

【ICLR2022】GNN-LM基于全局信息的图神经网络语义理解模型

【ICLR2022】GNN-LM基于全局信息的图神经网络语义理解模型

专知会员服务

21+阅读 · 2022年2月12日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

320+阅读 · 2020年11月26日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

近期必读的五篇 IJCAI 2020【图神经网络 (GNN)+NLP 】相关论文

近期必读的五篇 IJCAI 2020【图神经网络 (GNN)+NLP 】相关论文

专知会员服务

76+阅读 · 2020年8月18日

近期必读的五篇顶会 ACL 2020【图神经网络 (GNN) 】相关论文

近期必读的五篇顶会 ACL 2020【图神经网络 (GNN) 】相关论文

专知会员服务

105+阅读 · 2020年6月9日

近期必读的五篇顶会ACL 2020【图神经网络 (GNN) 】相关论文

近期必读的五篇顶会ACL 2020【图神经网络 (GNN) 】相关论文

专知会员服务

81+阅读 · 2020年5月5日

六篇 EMNLP 2019【图神经网络(GNN)+NLP】相关论文

六篇 EMNLP 2019【图神经网络(GNN)+NLP】相关论文

专知会员服务

72+阅读 · 2019年11月3日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

KDD2021 | 最新GNN官方教程

KDD2021 | 最新GNN官方教程

机器学习与推荐算法

2+阅读 · 2021年8月18日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

谷歌发表的史上最强NLP模型BERT的官方代码和预训练模型可以下载了

谷歌发表的史上最强NLP模型BERT的官方代码和预训练模型可以下载了

AINLP

12+阅读 · 2018年11月1日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

中国地区气象要素的多时空信息特征研究

国家自然科学基金

0+阅读 · 2013年12月31日

Gab在中枢神经系统髓鞘发育与髓鞘损伤修复中的功能及其机制

国家自然科学基金

0+阅读 · 2012年12月31日

太平洋固氮生物的多样性和生物固氮粒级结构研究

国家自然科学基金

0+阅读 · 2012年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

对偶框架各向异性提升变换理论与应用研究

国家自然科学基金

0+阅读 · 2012年12月31日

内质网应激在砷致神经细胞毒性中的作用机制及干预研究

国家自然科学基金

0+阅读 · 2012年12月31日

细胞团显微成像与分析用高分辨率X-CT研究

国家自然科学基金

0+阅读 · 2011年12月31日

关系的分解与Domain的表示

国家自然科学基金

1+阅读 · 2011年12月31日

基于Treelet变换的多时相SAR图像变化检测

国家自然科学基金

0+阅读 · 2009年12月31日

中国对虾IGF-ⅠIGF-Ⅱ#22522;因序列及其多态性与生长性状的关联分析

国家自然科学基金

0+阅读 · 2009年12月31日

Attention in Attention: Modeling Context Correlation for Efficient Video Classification

Arxiv

0+阅读 · 2022年4月20日

METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals

Arxiv

0+阅读 · 2022年4月16日

Spatio-Temporal-Frequency Graph Attention Convolutional Network for Aircraft Recognition Based on Heterogeneous Radar Network

Spatio-Temporal-Frequency Graph Attention Convolutional Network for Aircraft Recognition Based on Heterogeneous Radar Network

Arxiv

0+阅读 · 2022年4月15日

LaMemo: Language Modeling with Look-Ahead Memory

Arxiv

0+阅读 · 2022年4月15日

QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering

Arxiv

20+阅读 · 2021年5月27日

Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension

Arxiv

12+阅读 · 2020年12月14日

Heterogeneous Graph Transformer

Heterogeneous Graph Transformer

Arxiv

27+阅读 · 2020年3月3日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Arxiv

14+阅读 · 2019年6月19日

Link Prediction Based on Graph Neural Networks

Arxiv

26+阅读 · 2018年2月27日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

【ICLR2022】GNN-LM基于全局信息的图神经网络语义理解模型

【ICLR2022】GNN-LM基于全局信息的图神经网络语义理解模型

专知会员服务

21+阅读 · 2022年2月12日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

320+阅读 · 2020年11月26日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

近期必读的五篇 IJCAI 2020【图神经网络 (GNN)+NLP 】相关论文

近期必读的五篇 IJCAI 2020【图神经网络 (GNN)+NLP 】相关论文

专知会员服务

76+阅读 · 2020年8月18日

近期必读的五篇顶会 ACL 2020【图神经网络 (GNN) 】相关论文

近期必读的五篇顶会 ACL 2020【图神经网络 (GNN) 】相关论文

专知会员服务

105+阅读 · 2020年6月9日

近期必读的五篇顶会ACL 2020【图神经网络 (GNN) 】相关论文

近期必读的五篇顶会ACL 2020【图神经网络 (GNN) 】相关论文

专知会员服务

81+阅读 · 2020年5月5日

六篇 EMNLP 2019【图神经网络(GNN)+NLP】相关论文

六篇 EMNLP 2019【图神经网络(GNN)+NLP】相关论文

专知会员服务

72+阅读 · 2019年11月3日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

126页ppt《AI应用（AI Agent）开发新范式》！

基于深度神经网络的视频分析中的效率优化技术综述：处理系统、算法与应用

WWW2025 | KAG：一种大模型知识增强生成框架

用于时间序列预测的扩散模型：综述

相关资讯

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

KDD2021 | 最新GNN官方教程

KDD2021 | 最新GNN官方教程

机器学习与推荐算法

2+阅读 · 2021年8月18日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

谷歌发表的史上最强NLP模型BERT的官方代码和预训练模型可以下载了

谷歌发表的史上最强NLP模型BERT的官方代码和预训练模型可以下载了

AINLP

12+阅读 · 2018年11月1日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Attention in Attention: Modeling Context Correlation for Efficient Video Classification

Arxiv

0+阅读 · 2022年4月20日

METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals

Arxiv

0+阅读 · 2022年4月16日

Spatio-Temporal-Frequency Graph Attention Convolutional Network for Aircraft Recognition Based on Heterogeneous Radar Network

Spatio-Temporal-Frequency Graph Attention Convolutional Network for Aircraft Recognition Based on Heterogeneous Radar Network

Arxiv

0+阅读 · 2022年4月15日

LaMemo: Language Modeling with Look-Ahead Memory

Arxiv

0+阅读 · 2022年4月15日

QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering

Arxiv

20+阅读 · 2021年5月27日

Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension

Arxiv

12+阅读 · 2020年12月14日

Heterogeneous Graph Transformer

Heterogeneous Graph Transformer

Arxiv

27+阅读 · 2020年3月3日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Arxiv

14+阅读 · 2019年6月19日

Link Prediction Based on Graph Neural Networks

Arxiv

26+阅读 · 2018年2月27日

相关基金

中国地区气象要素的多时空信息特征研究

国家自然科学基金

0+阅读 · 2013年12月31日

Gab在中枢神经系统髓鞘发育与髓鞘损伤修复中的功能及其机制

国家自然科学基金

0+阅读 · 2012年12月31日

太平洋固氮生物的多样性和生物固氮粒级结构研究

国家自然科学基金

0+阅读 · 2012年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

对偶框架各向异性提升变换理论与应用研究

国家自然科学基金

0+阅读 · 2012年12月31日

内质网应激在砷致神经细胞毒性中的作用机制及干预研究

国家自然科学基金

0+阅读 · 2012年12月31日

细胞团显微成像与分析用高分辨率X-CT研究

国家自然科学基金

0+阅读 · 2011年12月31日

关系的分解与Domain的表示

国家自然科学基金

1+阅读 · 2011年12月31日

基于Treelet变换的多时相SAR图像变化检测

国家自然科学基金

0+阅读 · 2009年12月31日

中国对虾IGF-ⅠIGF-Ⅱ#22522;因序列及其多态性与生长性状的关联分析

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员