Few/zero-shot learning is a major challenge in many classification tasks, where a classifier is required to recognise instances of classes that have very few or even no training samples. It becomes more difficult in multi-label classification, where each instance is labelled with more than one class. In this paper, we present a simple multi-graph aggregation model that fuses knowledge from multiple label graphs encoding different semantic label relationships, in order to study how the aggregated knowledge can benefit multi-label zero/few-shot document classification. The model utilises three kinds of semantic information: pre-trained word embeddings, label descriptions, and pre-defined label relations. Experimental results on two large clinical datasets (MIMIC-II and MIMIC-III) and the EU legislation dataset show that methods equipped with multi-graph knowledge aggregation achieve significant performance improvements across almost all measures on few/zero-shot labels.
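The abstract does not specify how the label graphs are fused, so the following is only a minimal sketch of one plausible reading, not the authors' architecture. It assumes that each label starts with a feature vector (for instance, averaged pre-trained word embeddings of its description), that each label graph is given as an adjacency matrix, that one GCN-style propagation step is applied per graph, and that the results are fused by a plain mean. The function names, the two toy graphs, and the mean fusion are all hypothetical choices made for illustration.

```python
import numpy as np


def normalise_adjacency(adj):
    """Symmetrically normalise A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops so every degree is >= 1
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def aggregate_label_graphs(label_feats, adjacencies):
    """Propagate label features over each graph, then average the results.

    Mean fusion is an assumption; the paper may use a different aggregator.
    """
    propagated = [normalise_adjacency(a) @ label_feats for a in adjacencies]
    return np.mean(propagated, axis=0)


rng = np.random.default_rng(0)
num_labels, dim = 4, 8

# Stand-in for label features built from label descriptions (assumed).
label_feats = rng.normal(size=(num_labels, dim))

# Two hypothetical label graphs over the same four labels.
hierarchy = np.array([[0, 1, 1, 0],
                      [1, 0, 0, 0],
                      [1, 0, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
similarity = np.array([[0, 0, 0, 1],
                       [0, 0, 1, 0],
                       [0, 1, 0, 0],
                       [1, 0, 0, 0]], dtype=float)

fused = aggregate_label_graphs(label_feats, [hierarchy, similarity])

# Score a document vector against every label, including labels with no
# training samples, using an independent sigmoid per label (multi-label).
doc_vec = rng.normal(size=dim)
scores = 1.0 / (1.0 + np.exp(-(fused @ doc_vec)))
```

Because a label's fused vector depends on its neighbours in every graph, a label with no training documents still receives a representation informed by related, better-populated labels, which is the intuition behind using label graphs for few/zero-shot classification.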