CBOW并非你所需要的全部:将CBOW与构成矩阵空间模型结合起来 (CBOW Is Not All You Need: Combining CBOW with the Compositional Matrix Space Model) - 专知论文

会员服务 ·

0

连续词袋模型 · Continuity · Performer · Better · MoDELS ·

2019 年 2 月 18 日

CBOW Is Not All You Need: Combining CBOW with the Compositional Matrix Space Model

翻译：CBOW并非你所需要的全部:将CBOW与构成矩阵空间模型结合起来

Florian Mai,Lukas Galke,Ansgar Scherp

from arxiv, Conference paper at ICLR 2019

Continuous Bag of Words (CBOW) is a powerful text embedding method. Due to its strong capabilities to encode word content, CBOW embeddings perform well on a wide range of downstream tasks while being efficient to compute. However, CBOW is not capable of capturing the word order. The reason is that the computation of CBOW's word embeddings is commutative, i.e., embeddings of XYZ and ZYX are the same. In order to address this shortcoming, we propose a learning algorithm for the Continuous Matrix Space Model, which we call Continual Multiplication of Words (CMOW). Our algorithm is an adaptation of word2vec, so that it can be trained on large quantities of unlabeled text. We empirically show that CMOW better captures linguistic properties, but it is inferior to CBOW in memorizing word content. Motivated by these findings, we propose a hybrid model that combines the strengths of CBOW and CMOW. Our results show that the hybrid CBOW-CMOW-model retains CBOW's strong ability to memorize word content while at the same time substantially improving its ability to encode other linguistic information by 8%. As a result, the hybrid also performs better on 8 out of 11 supervised downstream tasks with an average improvement of 1.2%.

翻译：连续的 Word Bag (CBOW) 是一种强大的嵌入文本方法。由于 CBOW 的嵌入功能对字内容进行编码的强大能力, CBOW 的嵌入功能在一系列广泛的下游任务上表现良好, 同时能够高效地进行计算。但是, CBOW 无法捕捉字顺序。原因是计算 CBOW 的字嵌入是通俗的, 也就是说, 嵌入 XYZ 和 ZYX 是相同的。为了解决这一缺陷, 我们为连续的矩阵空间模型( 我们称之为 CMOW ) 提议了一个学习算法。我们的算法是调换Word2vec, 这样它就可以在大量无标签的文本上接受培训。我们从经验上表明 CBOW 更好地捕捉到语言内容的特性, 但是由于这些发现, CBOW 和 CMOW 的优势, 我们提议了一个混合模型, 结合CBOW 空间模型的优势。我们的计算结果显示, 混合的 CBOW- CM- 建模模型在改进了它的平均8 能力上, 也保留了它的平均版本改进了自己在8 格式上改进了它的平均的的。

1

相关内容

连续词袋模型

连续词袋模型

连续词袋模型（CBOW），利用上下文或周围的单词来预测中心词。其输入为某一个特征词的上下文相关对应的词向量（单词的one-hot编码）；输出为这特定的一个词的词向量(单词的one-hot编码）。

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

专知会员服务

105+阅读 · 2020年6月10日

Query2box: 使用盒嵌入对向量空间中的知识图谱进行推理，Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings

专知会员服务

45+阅读 · 2020年5月11日

【NLP模型压缩方法综述】《A Survey of Methods for Model Compression in NLP》by Madison May

【NLP模型压缩方法综述】《A Survey of Methods for Model Compression in NLP》by Madison May

专知会员服务

42+阅读 · 2020年4月22日

所有好的向量空间都是同构的吗?Are All Good Word Vector Spaces Isomorphic?

所有好的向量空间都是同构的吗?Are All Good Word Vector Spaces Isomorphic?

专知会员服务

8+阅读 · 2020年4月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

52+阅读 · 2020年1月30日

基于图的word2vec负采样( GNEG:Graph-Based Negative Sampling for word2vec)

基于图的word2vec负采样( GNEG:Graph-Based Negative Sampling for word2vec)

专知会员服务

39+阅读 · 2019年11月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

45+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

31+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

57+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

144+阅读 · 2019年10月12日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

23+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

25+阅读 · 2019年5月18日

上百种预训练中文词向量：Chinese-Word-Vectors

上百种预训练中文词向量：Chinese-Word-Vectors

AINLP

23+阅读 · 2019年2月26日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

26+阅读 · 2019年1月4日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

基于 Doc2vec 训练句子向量

基于 Doc2vec 训练句子向量

AI研习社

6+阅读 · 2018年5月16日

【论文推荐】最新七篇自注意力机制(Self-attention)相关论文—结构化自注意力、相对位置、混合、句子表达、文本向量

【论文推荐】最新七篇自注意力机制(Self-attention)相关论文—结构化自注意力、相对位置、混合、句子表达、文本向量

专知

29+阅读 · 2018年3月12日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

论文共读 | Attention is All You Need

论文共读 | Attention is All You Need

黑龙江大学自然语言处理实验室

14+阅读 · 2017年9月7日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Visualizing and Measuring the Geometry of BERT

Visualizing and Measuring the Geometry of BERT

Arxiv

7+阅读 · 2019年10月28日

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT

Arxiv

3+阅读 · 2019年9月12日

Attention Is (not) All You Need for Commonsense Reasoning

Arxiv

7+阅读 · 2019年5月31日

You May Not Need Attention

Arxiv

4+阅读 · 2018年10月31日

Interpretable and Compositional Relation Learning by Joint Training with an Autoencoder

Arxiv

4+阅读 · 2018年5月24日

Enhancing Network Embedding with Auxiliary Information: An Explicit Matrix Factorization Perspective

Arxiv

3+阅读 · 2018年3月5日

Stable Distribution Alignment Using the Dual of the Adversarial Distance

Arxiv

3+阅读 · 2018年1月30日

Word Translation Without Parallel Data

Arxiv

7+阅读 · 2018年1月30日

Knowledge-based Word Sense Disambiguation using Topic Models

Arxiv

5+阅读 · 2018年1月5日

Attention Is All You Need

Arxiv

25+阅读 · 2017年12月6日

VIP会员

文章信息

相关主题

连续词袋模型

相关VIP内容

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

专知会员服务

105+阅读 · 2020年6月10日

Query2box: 使用盒嵌入对向量空间中的知识图谱进行推理，Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings

专知会员服务

45+阅读 · 2020年5月11日

【NLP模型压缩方法综述】《A Survey of Methods for Model Compression in NLP》by Madison May

【NLP模型压缩方法综述】《A Survey of Methods for Model Compression in NLP》by Madison May

专知会员服务

42+阅读 · 2020年4月22日

所有好的向量空间都是同构的吗?Are All Good Word Vector Spaces Isomorphic?

所有好的向量空间都是同构的吗?Are All Good Word Vector Spaces Isomorphic?

专知会员服务

8+阅读 · 2020年4月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

52+阅读 · 2020年1月30日

基于图的word2vec负采样( GNEG:Graph-Based Negative Sampling for word2vec)

基于图的word2vec负采样( GNEG:Graph-Based Negative Sampling for word2vec)

专知会员服务

39+阅读 · 2019年11月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

45+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

31+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

57+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

144+阅读 · 2019年10月12日

热门VIP内容

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

23+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

25+阅读 · 2019年5月18日

上百种预训练中文词向量：Chinese-Word-Vectors

上百种预训练中文词向量：Chinese-Word-Vectors

AINLP

23+阅读 · 2019年2月26日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

26+阅读 · 2019年1月4日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

基于 Doc2vec 训练句子向量

基于 Doc2vec 训练句子向量

AI研习社

6+阅读 · 2018年5月16日

【论文推荐】最新七篇自注意力机制(Self-attention)相关论文—结构化自注意力、相对位置、混合、句子表达、文本向量

【论文推荐】最新七篇自注意力机制(Self-attention)相关论文—结构化自注意力、相对位置、混合、句子表达、文本向量

专知

29+阅读 · 2018年3月12日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

论文共读 | Attention is All You Need

论文共读 | Attention is All You Need

黑龙江大学自然语言处理实验室

14+阅读 · 2017年9月7日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Visualizing and Measuring the Geometry of BERT

Visualizing and Measuring the Geometry of BERT

Arxiv

7+阅读 · 2019年10月28日

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT

Arxiv

3+阅读 · 2019年9月12日

Attention Is (not) All You Need for Commonsense Reasoning

Arxiv

7+阅读 · 2019年5月31日

You May Not Need Attention

Arxiv

4+阅读 · 2018年10月31日

Interpretable and Compositional Relation Learning by Joint Training with an Autoencoder

Arxiv

4+阅读 · 2018年5月24日

Enhancing Network Embedding with Auxiliary Information: An Explicit Matrix Factorization Perspective

Arxiv

3+阅读 · 2018年3月5日

Stable Distribution Alignment Using the Dual of the Adversarial Distance

Arxiv

3+阅读 · 2018年1月30日

Word Translation Without Parallel Data

Arxiv

7+阅读 · 2018年1月30日

Knowledge-based Word Sense Disambiguation using Topic Models

Arxiv

5+阅读 · 2018年1月5日

Attention Is All You Need

Arxiv

25+阅读 · 2017年12月6日

微信扫码咨询专知VIP会员