OTSeq2Set: An Optimal Transport Enhanced Sequence-to-Set Model for Extreme Multi-label Text Classification

Extreme multi-label text classification (XMTC) is the task of finding the most relevant subset of labels from an extremely large label collection. Recently, some deep learning models have achieved state-of-the-art results on XMTC tasks. These models commonly predict scores for all labels with a fully connected layer as the last layer of the model. However, such models cannot predict a relatively complete, variable-length label subset for each document, because they select the positive labels relevant to the document either by a fixed threshold or by taking the top k labels in descending order of score. A less popular family of deep learning models, sequence-to-sequence (Seq2Seq) models, focuses on predicting variable-length positive labels as a sequence. However, the labels in XMTC tasks are essentially an unordered set rather than an ordered sequence, so the default label order restricts Seq2Seq models during training. To address this limitation of Seq2Seq, we propose an autoregressive sequence-to-set model for XMTC tasks named OTSeq2Set. Our model generates predictions in a student-forcing scheme and is trained with a loss function based on bipartite matching, which enables permutation invariance. Meanwhile, we use the optimal transport distance as a measure to force the model to focus on the closest labels in the semantic label space. Experiments show that OTSeq2Set outperforms other competitive baselines on 4 benchmark datasets. In particular, on the Wikipedia dataset with 31k labels, it outperforms the state-of-the-art Seq2Seq method by 16.34% in micro-F1 score. The code is available at https://github.com/caojie54/OTSeq2Set.
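To make the idea of a bipartite-matching loss concrete, the following is a minimal sketch and not the authors' implementation. It assumes the decoder emits one probability distribution over labels per step, and it scores a document by the cheapest one-to-one pairing of decoding steps with gold labels, so the order in which the gold labels are listed does not affect the loss. The function name `bipartite_matching_loss`, the toy probabilities, and the brute-force search are all illustrative assumptions; a practical version would likely use the Hungarian algorithm instead of enumerating permutations.

```python
import math
from itertools import permutations


def bipartite_matching_loss(step_probs, gold_labels):
    """Return (loss, assignment) for the cheapest one-to-one matching.

    step_probs: list of per-step probability lists over all labels.
    gold_labels: gold label ids, treated as an unordered set.
    Assumes len(gold_labels) <= len(step_probs).
    """
    gold = sorted(set(gold_labels))  # the input order must not matter
    best_loss, best_assignment = math.inf, None
    # Brute force over which decoding step is paired with each gold label.
    # Fine for a toy example; real code would use the Hungarian algorithm.
    for steps in permutations(range(len(step_probs)), len(gold)):
        nll = -sum(math.log(step_probs[s][g]) for s, g in zip(steps, gold))
        loss = nll / len(gold)
        if loss < best_loss:
            best_loss, best_assignment = loss, dict(zip(steps, gold))
    return best_loss, best_assignment


# Toy example (hypothetical numbers): 2 decoding steps, 3 labels, gold set {0, 2}.
step_probs = [[0.1, 0.2, 0.7],
              [0.8, 0.1, 0.1]]
loss_a, match_a = bipartite_matching_loss(step_probs, [0, 2])
loss_b, match_b = bipartite_matching_loss(step_probs, [2, 0])
print(match_a)           # mapping: decoding step -> matched gold label
print(loss_a == loss_b)  # the loss ignores the order of the gold labels
```

In this toy case the first step puts most of its mass on label 2 and the second on label 0, so the matching pairs them that way even though the gold labels are listed as 0 then 2. A standard Seq2Seq loss with a fixed label order would instead penalize the model for emitting the labels in the "wrong" order.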

