Given a collection of bags, where each bag is a set of images, our goal is to select one image from each bag such that the selected images all belong to the same object class. We model the selection as an energy minimization problem with unary and pairwise potential functions. Inspired by recent few-shot learning algorithms, we propose an approach that learns the potential functions directly from the data. Furthermore, we propose a fast greedy inference algorithm for energy minimization. We evaluate our approach on few-shot common object recognition as well as object co-localization tasks. Our experiments show that learning the pairwise and unary terms substantially improves performance over several well-known methods for these tasks. The proposed greedy optimization algorithm achieves performance comparable to state-of-the-art structured inference algorithms while being roughly 10 times faster. The code is publicly available at https://github.com/haamoon/finding_common_object.
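To make the formulation concrete, the sketch below writes the selection energy as a sum of unary costs plus pairwise costs between the chosen images, and compares a simple sequential greedy pass against exhaustive search on a toy problem. It is an illustration only: the abstract does not specify the greedy procedure or the learned potentials, so the function names (`energy`, `greedy_select`, `exhaustive_select`), the hand-set unary costs, and the 1-D "features" whose absolute difference stands in for a pairwise potential are all assumptions.

```python
from itertools import product

def energy(selection, unary, pairwise):
    """Total energy of picking item selection[i] from bag i."""
    total = sum(unary[i][k] for i, k in enumerate(selection))
    for i in range(len(selection)):
        for j in range(i + 1, len(selection)):
            total += pairwise(i, selection[i], j, selection[j])
    return total

def greedy_select(unary, pairwise):
    """Visit bags in order; in each bag pick the item that adds the least
    energy given the items already chosen (an assumed greedy scheme)."""
    selection = []
    for i, bag_costs in enumerate(unary):
        def added_cost(k):
            links = sum(pairwise(j, selection[j], i, k) for j in range(i))
            return bag_costs[k] + links
        selection.append(min(range(len(bag_costs)), key=added_cost))
    return selection

def exhaustive_select(unary, pairwise):
    """Brute-force minimizer, feasible only for tiny problems."""
    candidates = product(*(range(len(bag)) for bag in unary))
    return list(min(candidates, key=lambda s: energy(s, unary, pairwise)))

# Toy data (hypothetical): three bags of two items each.
features = [[0.0, 5.0], [5.0, 9.0], [2.0, 5.0]]
unary = [[0.2, 0.1], [0.1, 0.3], [0.3, 0.1]]

def pairwise(i, a, j, b):
    # Stand-in for a learned pairwise potential: distance between features.
    return abs(features[i][a] - features[j][b])

greedy = greedy_select(unary, pairwise)
exact = exhaustive_select(unary, pairwise)
print(greedy, exact)  # [1, 0, 1] [1, 0, 1]
```

On this toy instance the greedy pass recovers the exact minimizer, but that is not guaranteed in general. The greedy pass evaluates each item once against the earlier picks, whereas exhaustive search grows exponentially with the number of bags.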