Using an attention mechanism to dynamically generate the weights of different connections is the idea behind the self-attention model. The attention mechanism mimics the internal process of biological observation: it aligns internal experience with external sensation to increase the observation fidelity of selected regions. Because it can quickly extract important features from sparse data, attention is widely used in natural language processing, especially machine translation. Self-attention is a refinement of the attention mechanism that reduces the dependence on external information and is better at capturing the internal correlations within data or features.
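As a minimal sketch of how self-attention "dynamically" generates connection weights, here is a NumPy implementation of standard scaled dot-product self-attention. The projection matrices `Wq`, `Wk`, `Wv` and the toy shapes are illustrative assumptions, not taken from any specific paper:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # (n, n): pairwise connection strengths
    A = softmax(scores, axis=-1)      # each row is a weight distribution over positions
    return A @ V, A                   # weighted mix of values, plus the weights themselves

rng = np.random.default_rng(0)
n, d = 4, 8
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out, A = self_attention(X, Wq, Wk, Wv)   # out: (4, 8); rows of A sum to 1
```

The attention matrix `A` is recomputed for every input, which is exactly the "dynamic connection weights" the paragraph describes: unlike a fixed weight matrix, the strength of each pairwise connection depends on the content of the sequence itself.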


Weakly supervised object detection (WSOD) has emerged as an effective tool for training object detectors using only image-level class labels. However, without object-level labels, WSOD detectors tend to place bounding boxes on salient objects, cluttered objects, and discriminative object parts. Moreover, image-level class labels do not enforce consistent detections across different transformations of the same image. To address these issues, we propose a Comprehensive Attention Self-Distillation (CASD) training approach for WSOD. To balance feature learning among object instances, CASD computes comprehensive attention aggregated over multiple transformations and feature layers of the same image. To enforce consistent spatial supervision on objects, CASD performs self-distillation on the WSOD network, so that the comprehensive attention is approximated simultaneously by multiple transformations and feature layers of the same image. CASD produces state-of-the-art results on standard benchmarks such as PASCAL VOC 2007/2012 and MS-COCO.

https://www.ri.cmu.edu/publications/comprehensive-attention-self-distillation-for-weakly-supervised-object-detection/
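The aggregation-then-distillation idea in the abstract can be sketched as follows. This is an illustrative reconstruction from the abstract only, not the authors' code: the attention-map definition, the max aggregation, and the MSE distillation loss are all assumptions. Attention maps from multiple views (here the original image and a simulated flipped view, mapped back to a common coordinate frame) are aggregated element-wise into a "comprehensive" map, which then serves as the distillation target for each individual map:

```python
import numpy as np

def spatial_attention(feat):
    """A simple spatial attention map: mean absolute activation over channels,
    normalized to [0, 1]. (Illustrative; the paper's exact form may differ.)"""
    a = np.abs(feat).mean(axis=0)
    return (a - a.min()) / (a.max() - a.min() + 1e-8)

def comprehensive_attention(maps):
    """Aggregate per-view / per-layer attention maps by element-wise max."""
    return np.maximum.reduce(maps)

def casd_distillation_loss(maps, target):
    """Push every individual attention map toward the aggregated target."""
    return float(np.mean([np.mean((m - target) ** 2) for m in maps]))

rng = np.random.default_rng(0)
feat_orig = rng.standard_normal((16, 7, 7))     # features of the original image
feat_flip = rng.standard_normal((16, 7, 7))     # features of a flipped view (simulated)
m_orig = spatial_attention(feat_orig)
m_flip = spatial_attention(feat_flip)[:, ::-1]  # map the flipped view back to original coords
target = comprehensive_attention([m_orig, m_flip])
loss = casd_distillation_loss([m_orig, m_flip], target)
```

Because the target is the element-wise max, it dominates every individual map; minimizing the distillation loss therefore pushes each transformation's attention up toward the most complete evidence seen across all views, which is the "balanced feature learning" the abstract describes.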


Latest Content

Siamese-based trackers have achieved excellent performance on visual object tracking. However, the target template is not updated online, and the features of the target template and search image are computed independently in a Siamese architecture. In this paper, we propose Deformable Siamese Attention Networks, referred to as SiamAttn, by introducing a new Siamese attention mechanism that computes deformable self-attention and cross-attention. The self-attention learns strong context information via spatial attention, and selectively emphasizes interdependent channel-wise features with channel attention. The cross-attention is capable of aggregating rich contextual inter-dependencies between the target template and the search image, providing an implicit way to adaptively update the target template. In addition, we design a region refinement module that computes depth-wise cross-correlations between the attentional features for more accurate tracking. We conduct experiments on six benchmarks, where our method achieves new state-of-the-art results, outperforming the strong baseline, SiamRPN++ [24], improving EAO from 0.464 to 0.537 on VOT 2016 and from 0.415 to 0.470 on VOT 2018. Our code is available at: https://github.com/msight-tech/research-siamattn.
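The depth-wise cross-correlation mentioned in the abstract can be sketched as below. This is a naive NumPy version for clarity (all shapes are illustrative assumptions; in practice trackers implement this as a grouped convolution on the GPU). Each channel of the template acts as a filter slid over the matching channel of the search feature, producing one response map per channel:

```python
import numpy as np

def depthwise_xcorr(search, template):
    """Depth-wise cross-correlation: channel c of `template` is correlated
    ('valid' mode, no padding) with channel c of `search`.
    search: (C, Hs, Ws), template: (C, Ht, Wt) -> (C, Hs-Ht+1, Ws-Wt+1)."""
    C, Hs, Ws = search.shape
    Ct, Ht, Wt = template.shape
    assert C == Ct, "search and template must have the same channel count"
    Ho, Wo = Hs - Ht + 1, Ws - Wt + 1
    out = np.zeros((C, Ho, Wo))
    for c in range(C):
        for i in range(Ho):
            for j in range(Wo):
                out[c, i, j] = np.sum(search[c, i:i + Ht, j:j + Wt] * template[c])
    return out

rng = np.random.default_rng(0)
search = rng.standard_normal((8, 9, 9))     # attentional search-image features
template = rng.standard_normal((8, 3, 3))   # attentional template features
resp = depthwise_xcorr(search, template)    # (8, 7, 7) per-channel response maps
```

Keeping the correlation per-channel (rather than summing over channels as ordinary cross-correlation does) preserves channel-specific similarity information, which the subsequent refinement head can weigh separately.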

