视视视洋流成波:从视视视洋流和声源图像进行环境健全合成 (Visual onoma-to-wave: environmental sound synthesis from visual onomatopoeias and sound-source images) - 专知论文

会员服务 ·

0

回合 · 表示 · 多样性 · INFORMS · Performer ·

2022 年 10 月 17 日

Visual onoma-to-wave: environmental sound synthesis from visual onomatopoeias and sound-source images

翻译：视视视洋流成波:从视视视洋流和声源图像进行环境健全合成

Hien Ohnaka,Shinnosuke Takamichi,Keisuke Imoto,Yuki Okamoto,Kazuki Fujii,Hiroshi Saruwatari

from arxiv, Submitted to ICASSP 2023

We propose a method for synthesizing environmental sounds from visually represented onomatopoeias and sound sources. An onomatopoeia is a word that imitates a sound structure, i.e., the text representation of sound. From this perspective, onoma-to-wave has been proposed to synthesize environmental sounds from the desired onomatopoeia texts. Onomatopoeias have another representation: visual-text representations of sounds in comics, advertisements, and virtual reality. A visual onomatopoeia (visual text of onomatopoeia) contains rich information that is not present in the text, such as a long-short duration of the image, so the use of this representation is expected to synthesize diverse sounds. Therefore, we propose visual onoma-to-wave for environmental sound synthesis from visual onomatopoeia. The method can transfer visual concepts of the visual text and sound-source image to the synthesized sound. We also propose a data augmentation method focusing on the repetition of onomatopoeias to enhance the performance of our method. An experimental evaluation shows that the methods can synthesize diverse environmental sounds from visual text and sound-source images.

翻译：我们建议了一种方法,将视觉代表的奥多莫托科伊亚和声源中的环境声音合成为一体。一种显眼的奥多莫伊亚是一个模仿声音结构的词, 即声音的文字表达方式。从这个角度, 已经提议了通过对波合成来自理想的奥多托奥伊亚文本的环境声音。 Onotooeia 还有一个表达方式: 漫画、广告和虚拟现实中声音的视觉- 文本表达方式。一个视觉的奥多托奥伊亚( Onototooiea) 包含在文本中不存在的丰富信息, 例如图像的长短时间, 因此使用这种表达方式可以合成多种声音。因此, 我们提议了从视觉的奥多托奥伊亚语文本和声源图像中进行环境声音合成的视觉- 。该方法可以将视觉文本和声源图像的视觉概念转移到合成声音中。我们还提议一种数据增强能力的方法, 侧重于对奥托波亚的重复来增强我们的方法的性能。实验性评估显示, 方法可以综合从视觉文本和声源中产生的不同环境声音。

0

相关内容

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Call for Nominations: 2022 Multimedia Prize Paper Award

Call for Nominations: 2022 Multimedia Prize Paper Award

CCF多媒体专委会

0+阅读 · 2022年2月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

计算机 | 入门级EI会议ICVRIS 2019诚邀稿件

计算机 | 入门级EI会议ICVRIS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年6月24日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

DMRTA1/RAGE调控肝脏胰岛素抵抗的分子机制

国家自然科学基金

0+阅读 · 2015年12月31日

Survivin在低氧诱导喉癌淋巴管生成中的调控作用及其分子机制

国家自然科学基金

0+阅读 · 2015年12月31日

拓扑绝缘体与超导体耦合体系中交叉Andreev反射研究

国家自然科学基金

1+阅读 · 2014年12月31日

二维g-C3N4基纳米异质结的构建及光催化还原CO2性能研究

国家自然科学基金

0+阅读 · 2014年12月31日

Diversin介导非小细胞肺癌长春瑞滨耐药的分子机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于视觉密码学的moiré纹度量和构建研究

国家自然科学基金

0+阅读 · 2012年12月31日

介孔材料-氧化石墨烯复合纳米粒子协同增强改性不饱和聚酯复合材料的研究

国家自然科学基金

0+阅读 · 2012年12月31日

航空用蠕变时效成形高强铝合金疲劳特性研究

国家自然科学基金

0+阅读 · 2012年12月31日

水溶性REF3 (RE = Y,Gd)-KF体系上转换发光纳米材料合成及其生物应用

国家自然科学基金

0+阅读 · 2012年12月31日

LncRNAs在非小细胞肺癌EGFR-TKIs耐药中的作用及分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

PrivacyProber: Assessment and Detection of Soft-Biometric Privacy-Enhancing Techniques

Arxiv

0+阅读 · 2022年11月22日

FakeSV: A Multimodal Benchmark with Rich Social Context for Fake News Detection on Short Video Platforms

Arxiv

0+阅读 · 2022年11月20日

Normalizing Flows for Human Pose Anomaly Detection

Arxiv

0+阅读 · 2022年11月20日

SolderNet: Towards Trustworthy Visual Inspection of Solder Joints in Electronics Manufacturing Using Explainable Artificial Intelligence

Arxiv

0+阅读 · 2022年11月18日

Rare Yet Popular: Evidence and Implications from Labeled Datasets for Network Anomaly Detection

Arxiv

0+阅读 · 2022年11月18日

Pedestrian Spatio-Temporal Information Fusion For Video Anomaly Detection

Arxiv

0+阅读 · 2022年11月18日

Multi-VQG: Generating Engaging Questions for Multiple Images

Arxiv

0+阅读 · 2022年11月18日

A Survey on Generative Diffusion Model

Arxiv

46+阅读 · 2022年9月6日

Image/Video Deep Anomaly Detection: A Survey

Arxiv

16+阅读 · 2021年3月2日

vGraph: A Generative Model for Joint Community Detection and Node Representation Learning

vGraph: A Generative Model for Joint Community Detection and Node Representation Learning

Arxiv

14+阅读 · 2019年9月17日

VIP会员

文章信息

相关主题

相关VIP内容

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】移动计算摄影的神经场表示

大语言模型遇见法律人工智能：综述

【ICCV2025】InfGen：一种分辨率无关的可扩展图像合成范式

美军用无人地面战车发展：现代战争中超越弹药的多元应用

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Call for Nominations: 2022 Multimedia Prize Paper Award

Call for Nominations: 2022 Multimedia Prize Paper Award

CCF多媒体专委会

0+阅读 · 2022年2月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

计算机 | 入门级EI会议ICVRIS 2019诚邀稿件

计算机 | 入门级EI会议ICVRIS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年6月24日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

相关论文

PrivacyProber: Assessment and Detection of Soft-Biometric Privacy-Enhancing Techniques

Arxiv

0+阅读 · 2022年11月22日

FakeSV: A Multimodal Benchmark with Rich Social Context for Fake News Detection on Short Video Platforms

Arxiv

0+阅读 · 2022年11月20日

Normalizing Flows for Human Pose Anomaly Detection

Arxiv

0+阅读 · 2022年11月20日

SolderNet: Towards Trustworthy Visual Inspection of Solder Joints in Electronics Manufacturing Using Explainable Artificial Intelligence

Arxiv

0+阅读 · 2022年11月18日

Rare Yet Popular: Evidence and Implications from Labeled Datasets for Network Anomaly Detection

Arxiv

0+阅读 · 2022年11月18日

Pedestrian Spatio-Temporal Information Fusion For Video Anomaly Detection

Arxiv

0+阅读 · 2022年11月18日

Multi-VQG: Generating Engaging Questions for Multiple Images

Arxiv

0+阅读 · 2022年11月18日

A Survey on Generative Diffusion Model

Arxiv

46+阅读 · 2022年9月6日

Image/Video Deep Anomaly Detection: A Survey

Arxiv

16+阅读 · 2021年3月2日

vGraph: A Generative Model for Joint Community Detection and Node Representation Learning

vGraph: A Generative Model for Joint Community Detection and Node Representation Learning

Arxiv

14+阅读 · 2019年9月17日

相关基金

DMRTA1/RAGE调控肝脏胰岛素抵抗的分子机制

国家自然科学基金

0+阅读 · 2015年12月31日

Survivin在低氧诱导喉癌淋巴管生成中的调控作用及其分子机制

国家自然科学基金

0+阅读 · 2015年12月31日

拓扑绝缘体与超导体耦合体系中交叉Andreev反射研究

国家自然科学基金

1+阅读 · 2014年12月31日

二维g-C3N4基纳米异质结的构建及光催化还原CO2性能研究

国家自然科学基金

0+阅读 · 2014年12月31日

Diversin介导非小细胞肺癌长春瑞滨耐药的分子机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于视觉密码学的moiré纹度量和构建研究

国家自然科学基金

0+阅读 · 2012年12月31日

介孔材料-氧化石墨烯复合纳米粒子协同增强改性不饱和聚酯复合材料的研究

国家自然科学基金

0+阅读 · 2012年12月31日

航空用蠕变时效成形高强铝合金疲劳特性研究

国家自然科学基金

0+阅读 · 2012年12月31日

水溶性REF3 (RE = Y,Gd)-KF体系上转换发光纳米材料合成及其生物应用

国家自然科学基金

0+阅读 · 2012年12月31日

LncRNAs在非小细胞肺癌EGFR-TKIs耐药中的作用及分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

微信扫码咨询专知VIP会员