Multi-label image recognition is a fundamental yet practical task because real-world images inherently possess multiple semantic labels. However, it is difficult to collect large-scale multi-label annotations due to the complexity of both the input images and the output label space. To reduce the annotation cost, we propose a structured semantic transfer (SST) framework that enables training multi-label recognition models with partial labels, i.e., only some labels are known while the rest are missing (also called unknown labels) for each image. The framework consists of two complementary transfer modules that explore within-image and cross-image semantic correlations to transfer knowledge of known labels and generate pseudo labels for unknown labels. Specifically, an intra-image semantic transfer module learns an image-specific label co-occurrence matrix and maps the known labels to complement unknown labels based on this matrix. Meanwhile, a cross-image transfer module learns category-specific feature similarities and complements unknown labels that have high similarities. Finally, both the known and the generated labels are used to train the multi-label recognition models. Extensive experiments on the Microsoft COCO, Visual Genome, and Pascal VOC datasets show that the proposed SST framework achieves superior performance over current state-of-the-art algorithms. Code is available at https://github.com/HCPLab-SYSU/HCP-MLR-PL.
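The intra-image transfer idea can be illustrated with a minimal sketch: for each unknown label, score it by its co-occurrence strength with the labels known to be present, and promote it to a pseudo positive when the score is high enough. This is a hypothetical simplification for illustration only (the function name, the max-pooling rule, and the `thresh` parameter are assumptions); the actual SST module learns the co-occurrence matrix per image and operates on predicted score distributions rather than hard labels.

```python
def intra_image_pseudo_labels(known, cooccur, thresh=0.5):
    """Generate pseudo labels for unknown entries of a partial label vector.

    known:   list where 1 = known positive, 0 = known negative, -1 = unknown
    cooccur: cooccur[i][j] estimates how strongly label j co-occurs with label i
    """
    pseudo = list(known)
    positives = [i for i, v in enumerate(known) if v == 1]
    for j, v in enumerate(known):
        if v != -1:
            continue  # known labels are kept as-is
        # Score the unknown label by its strongest link to a known positive.
        score = max((cooccur[i][j] for i in positives), default=0.0)
        # Promote to pseudo positive only above the confidence threshold;
        # low-confidence labels stay unknown instead of becoming negatives.
        pseudo[j] = 1 if score >= thresh else -1
    return pseudo


cooccur = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
# Label 0 is present, label 1 is unknown but co-occurs strongly with 0:
print(intra_image_pseudo_labels([1, -1, 0], cooccur))  # -> [1, 1, 0]
```

The cross-image module follows the same complementation pattern but replaces the co-occurrence score with a learned feature similarity between the same category across different images.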