Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings

Sparse regression has recently been applied to enable transfer learning from very limited data. We study an extension of this approach to unsupervised learning -- in particular, learning word embeddings from unstructured text corpora using low-rank matrix factorization. Intuitively, when transferring word embeddings to a new domain, we expect the embeddings to change for only a small number of words -- e.g., the ones with novel meanings in that domain. We propose a novel group-sparse penalty that exploits this sparsity to perform transfer learning when very little text data is available in the target domain -- e.g., a single article of text. We prove generalization bounds for our algorithm. Furthermore, we empirically evaluate its effectiveness in terms of both prediction accuracy on downstream tasks and the interpretability of the results.
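The abstract does not spell out the penalty, so the following is only a minimal sketch of one common way to encode "few words change": a row-wise l2,1 (group lasso) norm on the difference between target and source embeddings, together with its proximal step (row-wise soft-thresholding). The function names, the threshold, and the toy numbers are assumptions made for illustration and are not taken from the paper.

```python
import math

def group_sparse_penalty(U, U_source):
    """Sum over words (rows) of the Euclidean norm of the embedding change."""
    return sum(
        math.sqrt(sum((u - s) ** 2 for u, s in zip(row, src)))
        for row, src in zip(U, U_source)
    )

def prox_group_sparse(U, U_source, threshold):
    """Row-wise soft-thresholding of U - U_source (prox of the l2,1 norm).

    Rows whose change has norm <= threshold snap back to the source
    embedding; other rows have the norm of their change reduced by threshold.
    """
    result = []
    for row, src in zip(U, U_source):
        delta = [u - s for u, s in zip(row, src)]
        norm = math.sqrt(sum(d * d for d in delta))
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        result.append([s + scale * d for s, d in zip(src, delta)])
    return result

# Toy data (hypothetical): 3 words, 2-dimensional embeddings.
U_source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
U_target = [[1.0, 0.1], [3.0, 5.0], [1.0, 1.0]]  # only word 1 moved substantially

U_shrunk = prox_group_sparse(U_target, U_source, threshold=0.5)
# Indices of words whose embedding still differs from the source after shrinkage.
changed_words = [i for i, (r, s) in enumerate(zip(U_shrunk, U_source)) if r != s]
```

In this sketch the small perturbation of word 0 is zeroed out while the large change of word 1 survives, which is the kind of word-level sparsity the abstract describes.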



Word Embeddings


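A distributed representation encodes language as dense, low-dimensional, continuous vectors. Researchers found early on that learned word embeddings exhibit analogy relations, for example apple − apples ≈ car − cars and man − woman ≈ king − queen. These methods can all be trained directly on large unlabeled corpora. The quality of the embeddings also depends heavily on the choice of context window size: large context windows tend to yield embeddings that reflect topical information, while small context windows tend to yield embeddings that reflect a word's function and its contextual semantics.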
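The analogy relation can be illustrated with a few lines of vector arithmetic. This is a toy sketch: the 2-dimensional vectors below are hand-made for illustration (not learned embeddings), and the `analogy` helper is a hypothetical name.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical embeddings: axis 0 roughly "royalty", axis 1 roughly "gender".
emb = {
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "apple": [-1.0, 0.2],
}

def analogy(a, b, c):
    """Return the word whose vector is closest to emb[c] - emb[a] + emb[b]."""
    target = [vc - va + vb for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))

# "man is to woman as king is to ?"
answer = analogy("man", "woman", "king")
```

With real embeddings the match is only approximate, so the nearest neighbor by cosine similarity is taken rather than an exact equality.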
