Few-shot image classification aims to train a model using only a few (e.g., 5 or even 1) examples of novel classes. The established way of doing so is to rely on a larger set of base data, either for pre-training a model or for training in a meta-learning context. Unfortunately, these approaches often suffer from overfitting since the models can easily memorize all of the novel samples. This paper mitigates this issue and proposes to leverage part of the base data by aligning the novel training instances to closely related instances in the base training set. This expands the effective novel training set by adding extra related base instances to the few novel ones, thereby allowing the entire network to be trained. Doing so limits overfitting and simultaneously strengthens the generalization capabilities of the network. We propose two associative alignment strategies: 1) a conditional adversarial alignment loss based on the Wasserstein distance; and 2) a metric-learning loss that minimizes the distance between related base samples and the centroid of novel instances in the feature space. Experiments on two standard datasets demonstrate that our centroid-based alignment loss results in absolute accuracy improvements of 4.4%, 1.2%, and 6.0% in 5-shot learning over the state of the art for object recognition, fine-grained classification, and cross-domain adaptation, respectively.
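The centroid-based metric-learning loss can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the use of NumPy arrays in place of learned network features, and the squared-Euclidean distance are all assumptions made for clarity.

```python
import numpy as np

def centroid_alignment_loss(novel_feats, base_feats, novel_labels, base_labels):
    """Illustrative sketch: pull related base samples toward the centroid
    of each novel class in feature space.

    novel_feats : (N, D) features of the few novel instances
    base_feats  : (M, D) features of base instances aligned to novel classes
    novel_labels, base_labels : integer class labels per instance
    """
    classes = np.unique(novel_labels)
    total = 0.0
    for c in classes:
        # Centroid of the novel instances belonging to class c.
        centroid = novel_feats[novel_labels == c].mean(axis=0)
        # Base samples associated with the same novel class.
        related = base_feats[base_labels == c]
        if related.size == 0:
            continue
        # Mean squared Euclidean distance to the centroid.
        total += ((related - centroid) ** 2).sum(axis=1).mean()
    return total / len(classes)
```

Minimizing this quantity draws the related base features toward the novel-class centroids, so gradient updates through the feature extractor effectively enlarge the novel training set with base instances.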