In this work, we evaluate contrastive models for the task of image retrieval. We hypothesise that models trained to encode semantic similarity among instances via discriminative learning should perform well on image retrieval, where relevance is defined in terms of instances of the same object. Through an extensive evaluation, we find that representations from models trained with contrastive methods perform on par with, and in some configurations outperform, a supervised baseline pre-trained on ImageNet labels across a range of retrieval settings. This is remarkable given that the contrastive models require no explicit supervision. We conclude that these models can be used to bootstrap base models for building more robust image retrieval engines.
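To make the evaluated setup concrete, retrieval with a contrastive model amounts to nearest-neighbour search in the learned embedding space. The sketch below is illustrative only (the function name, embedding dimension, and toy data are assumptions, not the paper's implementation); it ranks a gallery by cosine similarity to a query embedding:

```python
import numpy as np

def retrieve(query_emb, gallery_embs, k=5):
    """Rank gallery items by cosine similarity to the query embedding."""
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    sims = g @ q  # cosine similarity of each gallery item to the query
    return np.argsort(-sims)[:k]  # indices of the k most similar items

# Toy example: random vectors stand in for features from a contrastive
# encoder (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 128))
query = gallery[42] + 0.01 * rng.normal(size=128)  # near-duplicate of item 42
top_k = retrieve(query, gallery)
```

In practice the gallery embeddings would come from the frozen contrastive (or supervised baseline) encoder, and retrieval quality is then scored by whether top-ranked items depict the same object instance as the query.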