Low-shot object counting addresses estimating the number of previously unobserved objects in an image using only a few or no annotated test-time exemplars. A considerable challenge for modern low-shot counters is dense regions with small objects. While total counts in such situations are typically well estimated by density-based counters, their usefulness is limited by poor localization capabilities. Localization is better handled by point-detection-based counters, which build on query-based detectors. However, due to the limited number of pre-trained queries, they underperform on images with very large numbers of objects and resort to ad-hoc techniques such as upsampling and tiling. We propose CoDi, the first latent diffusion-based low-shot counter that produces high-quality density maps on which object locations can be determined by non-maxima suppression. Our core contribution is a new exemplar-based conditioning module that extracts the object prototypes and adjusts them to the intermediate layers of the denoising network, leading to accurate object location estimation. On the FSC benchmark, CoDi outperforms the state of the art by 15% MAE, 13% MAE, and 10% MAE in the few-shot, one-shot, and reference-less scenarios, respectively, and sets a new state of the art on the MCAC benchmark by outperforming the top method by 44% MAE. The code is available at https://github.com/gsustar/CoDi.
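To illustrate how object locations can be read off a density map by non-maxima suppression, the following is a minimal sketch, not the CoDi implementation. The function name `density_peaks` and the parameters `window` and `threshold` are assumptions introduced for illustration; the abstract does not specify the window size, the threshold, or how ties are handled.

```python
import numpy as np


def density_peaks(density, window=3, threshold=0.1):
    """Return (row, col) coordinates of local maxima in a 2D density map.

    Hypothetical helper: `window` (odd neighbourhood size) and `threshold`
    (minimum peak value) are illustrative choices, not values from the paper.
    """
    if window % 2 != 1:
        raise ValueError("window must be odd")
    density = np.asarray(density, dtype=float)
    h, w = density.shape
    pad = window // 2
    # Pad with -inf so border pixels are compared only against real values.
    padded = np.pad(density, pad, mode="constant", constant_values=-np.inf)

    # Maximum over each window x window neighbourhood, built from shifted views.
    local_max = np.full((h, w), -np.inf)
    for dy in range(window):
        for dx in range(window):
            local_max = np.maximum(local_max, padded[dy:dy + h, dx:dx + w])

    # Keep pixels that equal their neighbourhood maximum and exceed the threshold.
    # Note: equal-valued neighbours (plateaus) are all kept in this simple version.
    keep = (density == local_max) & (density >= threshold)
    return np.argwhere(keep)


if __name__ == "__main__":
    toy = np.zeros((8, 8))
    toy[2, 2] = 1.0   # first object centre
    toy[2, 3] = 0.5   # weaker response next to it, suppressed
    toy[5, 6] = 0.8   # second object centre
    peaks = density_peaks(toy)
    print(len(peaks), peaks.tolist())
```

In this sketch the number of surviving peaks serves as the count and their coordinates as the predicted object locations; a larger `window` merges nearby responses, which matters in dense regions with small objects.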