Deep metric learning employs deep neural networks to embed instances into a metric space such that distances between instances of the same class are small and distances between instances from different classes are large. In most existing deep metric learning techniques, the embedding of an instance is a feature vector produced by a deep neural network, and Euclidean distance or cosine similarity defines the distance between these vectors. In this paper, we study deep distributional embeddings of sequences, where the embedding of a sequence is given by the distribution of learned deep features across the sequence. This has the advantage that the embedding captures statistical information about the distribution of patterns within the sequence. When embeddings are distributions rather than vectors, measuring distances between embeddings involves comparing their respective distributions. We propose a distance metric based on Wasserstein distances between the distributions, together with a corresponding loss function for metric learning, which leads to a novel end-to-end trainable embedding model. We empirically observe that distributional embeddings outperform standard vector embeddings and that training with the proposed Wasserstein metric outperforms training with other distance functions.
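The core idea can be illustrated with a minimal sketch: each sequence is embedded as the empirical distribution of its per-step feature vectors, and two embeddings are compared by a Wasserstein distance between those distributions. The sketch below is an assumption-laden simplification, not the paper's exact formulation: it sums 1-D Wasserstein distances over feature dimensions (a common computationally cheap surrogate), and the `distributional_distance` function name and toy data are invented for illustration.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def distributional_distance(feats_a, feats_b):
    """Distance between two distributional sequence embeddings.

    Each embedding is the empirical distribution of per-step deep features,
    given as an array of shape [T, D] (T time steps, D feature dimensions).
    As a simple surrogate for a full multivariate Wasserstein distance, we
    sum 1-D Wasserstein distances over the D feature dimensions.
    """
    feats_a = np.asarray(feats_a, dtype=float)
    feats_b = np.asarray(feats_b, dtype=float)
    return sum(
        wasserstein_distance(feats_a[:, d], feats_b[:, d])
        for d in range(feats_a.shape[1])
    )


# Toy example: two sequences of 2-D features drawn from the same
# distribution vs. one drawn from a mean-shifted distribution.
rng = np.random.default_rng(0)
same_class = distributional_distance(
    rng.normal(0.0, 1.0, (50, 2)), rng.normal(0.0, 1.0, (50, 2))
)
diff_class = distributional_distance(
    rng.normal(0.0, 1.0, (50, 2)), rng.normal(3.0, 1.0, (50, 2))
)
# Sequences whose feature distributions differ should be farther apart.
print(same_class < diff_class)
```

In an end-to-end metric-learning setup, a distance of this kind would be plugged into a contrastive or triplet loss so that gradients flow back through the feature extractor; the paper's actual loss and distance computation may differ from this surrogate.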