SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, and More


Segmentation · Performance degradation · Shadow detection · Object detection · Image segmentation

April 18, 2023


Tianrun Chen, Lanyun Zhu, Chaotao Ding, Runlong Cao, Shangzhan Zhang, Yan Wang, Zejian Li, Lingyun Sun, Papa Mao, Ying Zang

The emergence of large models, also known as foundation models, has brought significant advancements to AI research. One such model is Segment Anything (SAM), which is designed for image segmentation tasks. However, as with other foundation models, our experimental findings suggest that SAM may fail or perform poorly in certain segmentation tasks, such as shadow detection and camouflaged object detection (concealed object detection). This study is the first to pave the way for applying the large pre-trained image segmentation model SAM to these downstream tasks, even in situations where SAM performs poorly. Rather than fine-tuning the SAM network, we propose SAM-Adapter, which incorporates domain-specific information or visual prompts into the segmentation network by using simple yet effective adapters. Our extensive experiments show that SAM-Adapter can significantly elevate the performance of SAM in challenging tasks, and we can even outperform task-specific network models and achieve state-of-the-art performance in the tasks we tested: camouflaged object detection and shadow detection. We believe our work opens up opportunities for utilizing SAM in downstream tasks, with potential applications in various fields, including medical image processing, agriculture, remote sensing, and more.
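To make the adapter idea in the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the `Adapter` class, the bottleneck size, the way the prompt vector is mixed in, and the zero-initialised up-projection are all assumptions chosen for illustration. It only shows the general pattern of leaving a pre-trained feature untouched and adding a small trainable correction that depends on task-specific information.

```python
import math
import random


def gelu(x):
    """Exact GELU activation using the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


class Adapter:
    """Hypothetical bottleneck adapter: down-project, GELU, up-project.

    The frozen backbone feature is never modified; the adapter only adds
    a small residual computed from the feature plus a task prompt vector.
    """

    def __init__(self, dim, bottleneck, seed=0):
        rng = random.Random(seed)
        # Trainable down-projection, small random values (assumed init).
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                     for _ in range(bottleneck)]
        # Up-projection starts at zero so the adapter is initially a no-op
        # (an assumption borrowed from common adapter practice).
        self.up = [[0.0] * bottleneck for _ in range(dim)]

    def __call__(self, feature, prompt):
        # Mix the frozen feature with the domain-specific prompt vector.
        mixed = [f + p for f, p in zip(feature, prompt)]
        hidden = [gelu(h) for h in matvec(self.down, mixed)]
        delta = matvec(self.up, hidden)
        # Residual connection: frozen feature plus learned correction.
        return [f + d for f, d in zip(feature, delta)]


if __name__ == "__main__":
    adapter = Adapter(dim=4, bottleneck=2)
    frozen_feature = [1.0, -2.0, 0.5, 3.0]   # stands in for a backbone output
    task_prompt = [0.1, 0.2, 0.3, 0.4]       # stands in for a visual prompt
    print(adapter(frozen_feature, task_prompt))
```

In a setup like this, only `down` and `up` would be updated during training while the backbone weights stay fixed, which is what keeps the number of trainable parameters small compared with full fine-tuning.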

