Deep metric learning employs deep neural networks to embed instances into a metric space such that distances between instances of the same class are small and distances between instances from different classes are large. In most existing deep metric learning techniques, the embedding of an instance is a feature vector produced by a deep neural network, and Euclidean distance or cosine similarity defines the distance between these vectors. In this paper, we study deep distributional embeddings of sequences, where the embedding of a sequence is given by the distribution of learned deep features across the sequence. This has the advantage that the embedding captures statistical information about the distribution of patterns within the sequence. When embeddings are distributions rather than vectors, measuring distances between embeddings involves comparing their respective distributions. We propose a distance metric based on Wasserstein distances between the distributions, together with a corresponding loss function for metric learning, which leads to a novel end-to-end trainable embedding model. We empirically observe that distributional embeddings outperform standard vector embeddings and that training with the proposed Wasserstein metric outperforms training with other distance functions.
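The core idea can be illustrated with a minimal sketch: each sequence is embedded as the empirical distribution of its per-step feature vectors, and two embeddings are compared by a Wasserstein distance between those distributions. The sketch below is an assumption-laden simplification, not the paper's exact formulation: it sums 1-D Wasserstein distances over feature dimensions (a common computationally cheap surrogate), and the `distributional_distance` function name and toy data are invented for illustration.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def distributional_distance(feats_a, feats_b):
    """Distance between two distributional sequence embeddings.

    Each embedding is the empirical distribution of per-step deep features,
    given as an array of shape [T, D] (T time steps, D feature dimensions).
    As a simple surrogate for a full multivariate Wasserstein distance, we
    sum 1-D Wasserstein distances over the D feature dimensions.
    """
    feats_a = np.asarray(feats_a, dtype=float)
    feats_b = np.asarray(feats_b, dtype=float)
    return sum(
        wasserstein_distance(feats_a[:, d], feats_b[:, d])
        for d in range(feats_a.shape[1])
    )


# Toy example: two sequences of 2-D features drawn from the same
# distribution vs. one drawn from a mean-shifted distribution.
rng = np.random.default_rng(0)
same_class = distributional_distance(
    rng.normal(0.0, 1.0, (50, 2)), rng.normal(0.0, 1.0, (50, 2))
)
diff_class = distributional_distance(
    rng.normal(0.0, 1.0, (50, 2)), rng.normal(3.0, 1.0, (50, 2))
)
# Sequences whose feature distributions differ should be farther apart.
print(same_class < diff_class)
```

In an end-to-end metric-learning setup, a distance of this kind would be plugged into a contrastive or triplet loss so that gradients flow back through the feature extractor; the paper's actual loss and distance computation may differ from this surrogate.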