Active learning improves annotation efficiency by selecting the most informative samples for annotation and model training. While most prior work has focused on selecting informative images for classification tasks, we investigate the more challenging setting of dense prediction, where annotations are more costly and time-intensive, especially in medical imaging. Region-level annotation has been shown to be more efficient than image-level annotation for these tasks. However, existing methods for representative annotation region selection suffer from high computational and memory costs, irrelevant region choices, and heavy reliance on uncertainty sampling. We propose decomposition sampling (DECOMP), a new active learning sampling strategy that addresses these limitations. It enhances annotation diversity by decomposing images into class-specific components using pseudo-labels and sampling regions from each class. Class-wise predictive confidence further guides the sampling process, ensuring that difficult classes receive additional annotations. Across ROI classification, 2-D segmentation, and 3-D segmentation, DECOMP consistently surpasses baseline methods by better sampling minority-class regions and boosting performance on these challenging classes. Code is available at https://github.com/JingnaQiu/DECOMP.git.
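To make the sampling idea concrete, below is a minimal Python sketch of class-balanced region sampling in the spirit described above: an image's pseudo-label map is decomposed into class-specific components, and regions are drawn from each class with a budget weighted toward low-confidence classes. This is not the authors' implementation; the function names, the `region_size` parameter, and the per-class mean-confidence vector are assumptions for illustration.

```python
import numpy as np


def class_budgets(class_confidence, total_budget):
    """Split the annotation budget across classes, giving low-confidence
    (difficult) classes a larger share: weights are (1 - confidence)."""
    conf = np.asarray(class_confidence, dtype=float)
    weights = np.maximum(1.0 - conf, 1e-8)
    weights = weights / weights.sum()
    budgets = np.floor(weights * total_budget).astype(int)
    # Hand any remainder to the least confident classes first.
    remainder = int(total_budget - budgets.sum())
    for c in np.argsort(conf)[:remainder]:
        budgets[c] += 1
    return budgets


def sample_regions(pseudo_label, class_confidence, total_budget,
                   region_size=64, rng=None):
    """Decompose a 2-D pseudo-label map into class-specific components and
    sample square regions centred on pixels of each class."""
    rng = np.random.default_rng(rng)
    h, w = pseudo_label.shape
    budgets = class_budgets(class_confidence, total_budget)
    regions = []
    for cls, budget in enumerate(budgets):
        ys, xs = np.nonzero(pseudo_label == cls)  # class-specific component
        if len(ys) == 0 or budget == 0:
            continue
        picks = rng.choice(len(ys), size=min(budget, len(ys)), replace=False)
        for i in picks:
            # Clip so the region stays inside the image bounds.
            y0 = int(np.clip(ys[i] - region_size // 2, 0, h - region_size))
            x0 = int(np.clip(xs[i] - region_size // 2, 0, w - region_size))
            regions.append((cls, y0, x0, region_size))
    return regions
```

Under these assumptions, a class with low predictive confidence receives a proportionally larger share of the region budget, while sampling within each class's pseudo-labeled component keeps minority classes represented even when they occupy few pixels.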