Compared with conventional hand-crafted approaches, deep-learning-based methods have achieved tremendous performance improvements by training elaborately designed networks on large-scale training sets. However, do we really need a large-scale training set for salient object detection (SOD)? In this paper, we provide a deeper insight into the interrelationship between SOD performance and the training set. To alleviate the conventional demand for large-scale training data, we present a feasible way to construct a novel small-scale training set that contains only 4K images. Moreover, we propose a novel bi-stream network to take full advantage of this small training set: it consists of two feature backbones with different structures and achieves complementary semantic saliency fusion via the proposed gate control unit. To the best of our knowledge, this is the first attempt to train on a small-scale training set while outperforming state-of-the-art models trained on large-scale ones; our method achieves leading performance on five benchmark datasets.
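As an illustration of the fusion idea mentioned above, the sketch below shows one common way a gate control unit can combine features from two backbone streams: a learned sigmoid gate produces per-channel weights that form a convex combination of the two streams. This is a minimal numpy sketch under assumed details (the gate's exact parameterization, `W` and `b`, is hypothetical, not the paper's specification).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(f1, f2, W, b):
    """Fuse features from two streams via a sigmoid gate (hypothetical form).

    f1, f2 : (C,) feature vectors from the two backbones
    W      : (C, 2C) gate weights, b : (C,) bias -- assumed learnable
    """
    gate = sigmoid(W @ np.concatenate([f1, f2]) + b)  # per-channel gate in (0, 1)
    return gate * f1 + (1.0 - gate) * f2              # convex combination per channel

# toy usage with random features standing in for backbone outputs
rng = np.random.default_rng(0)
C = 4
f1, f2 = rng.standard_normal(C), rng.standard_normal(C)
W, b = rng.standard_normal((C, 2 * C)), np.zeros(C)
fused = gated_fusion(f1, f2, W, b)
```

Because the gate lies in (0, 1), each fused channel stays between the corresponding channels of the two input streams, letting the network softly choose which backbone's semantics to trust.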