October 20, 2022

DiffEdit: Diffusion-based semantic image editing with mask guidance


Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord

From arXiv, preprint

Image generation has recently seen tremendous advances, with diffusion models making it possible to synthesize convincing images for a large variety of text prompts. In this article, we propose DiffEdit, a method that takes advantage of text-conditioned diffusion models for the task of semantic image editing, where the goal is to edit an image based on a text query. Semantic image editing is an extension of image generation, with the additional constraint that the generated image should be as similar as possible to a given input image. Current editing methods based on diffusion models usually require the user to provide a mask, which makes the task much easier by treating it as conditional inpainting. In contrast, our main contribution is the ability to automatically generate a mask highlighting the regions of the input image that need to be edited, by contrasting the predictions of a diffusion model conditioned on different text prompts. Moreover, we rely on latent inference to preserve content in those regions of interest and show excellent synergies with mask-based diffusion. DiffEdit achieves state-of-the-art editing performance on ImageNet. In addition, we evaluate semantic image editing in more challenging settings, using images from the COCO dataset as well as text-based generated images.
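The mask-generation idea described in the abstract (contrasting the predictions of a diffusion model under two different text prompts) can be illustrated with a small sketch. It assumes the two sets of noise predictions have already been computed and are passed in as NumPy arrays; the diffusion model itself is not shown. The function names `estimate_mask` and `blend_with_mask`, the averaging over samples and channels, the min-max rescaling, and the 0.5 threshold are all illustrative assumptions, not details stated in the abstract.

```python
import numpy as np


def estimate_mask(noise_pred_query, noise_pred_reference, threshold=0.5):
    """Binary edit mask from the disagreement between two noise predictions.

    Both inputs are assumed to have shape (n_samples, H, W, C): predictions
    for the same noised image, conditioned on two different text prompts.
    """
    # Per-pixel absolute difference, averaged over samples and channels.
    diff = np.abs(noise_pred_query - noise_pred_reference).mean(axis=(0, 3))
    # Rescale to [0, 1]; a constant map means the prompts never disagree.
    span = diff.max() - diff.min()
    if span == 0:
        return np.zeros(diff.shape, dtype=bool)
    normalized = (diff - diff.min()) / span
    # Pixels where the two predictions differ strongly are marked for editing.
    return normalized > threshold


def blend_with_mask(edited, original, mask):
    """Take edited pixels inside the mask and original pixels outside it."""
    return np.where(mask[..., None], edited, original)
```

In a full pipeline the blending step would presumably be applied during denoising rather than once on final pixels; the single `np.where` call here only shows how a mask restricts an edit to the highlighted region.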

