The motivation of our research is to develop a sound-to-image (S2I) translation system that enables a human receiver to visually infer the occurrence of sound-related events. We expect the computer to 'imagine' the scene from the captured sound, generating original images that picture the sound-emitting source. Previous studies on similar topics opted for simplified approaches using data with low content diversity and/or strong supervision. In contrast, we propose to perform unsupervised S2I translation using thousands of distinct and unknown scenes, with only lightly pre-cleaned data, just enough to guarantee aural-visual semantic coherence. To that end, we employ conditional generative adversarial networks (GANs) with a deep densely connected generator. In addition, we implement a moving-average adversarial loss to address GAN training instability. Though the specified S2I translation problem is quite challenging, we were able to generalize the translator model enough that, on average, more than 14% of the images translated from unknown sounds were interpretable and semantically coherent. Additionally, we present a solution based on informativity classifiers to perform quantitative evaluation of S2I translation.
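The abstract does not detail the moving-average adversarial loss, so the following is a minimal sketch of one plausible reading, in PyTorch: the current adversarial loss is blended with a detached exponential moving average (EMA) of past loss values, damping abrupt swings between minibatches. The class name `MovingAverageAdvLoss` and the decay factor `beta` are illustrative assumptions, not the paper's notation.

```python
import torch

class MovingAverageAdvLoss:
    """Blend the current adversarial loss with an exponential moving
    average (EMA) of past losses to damp abrupt swings during GAN training.

    NOTE: hypothetical reconstruction; the paper's exact moving-average
    formulation is not specified in the abstract.
    """

    def __init__(self, beta: float = 0.9):
        self.beta = beta   # EMA decay factor (assumed value)
        self.ema = None    # running average of past (detached) loss values

    def __call__(self, loss: torch.Tensor) -> torch.Tensor:
        if self.ema is None:
            self.ema = loss.detach()
        else:
            # update the running average with the current detached loss
            self.ema = self.beta * self.ema + (1.0 - self.beta) * loss.detach()
        # the EMA term is constant w.r.t. autograd, so the returned value is
        # smoothed while gradients still flow through the current `loss`
        return self.beta * self.ema + (1.0 - self.beta) * loss
```

Under this reading, the smoothed loss is used in place of the raw adversarial loss, e.g. `ma_loss(bce(d_fake, real_labels)).backward()`.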
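Likewise, a hedged sketch of how informativity classifiers could yield a quantitative score: a small CNN labels each translated image, and the metric is the fraction of images judged semantically coherent with their sound class. The architecture, the `InformativityClassifier` name, and the extra explicit "uninformative" output are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class InformativityClassifier(nn.Module):
    """Hypothetical evaluator: scores whether a translated image is
    interpretable and coherent with its sound class. The paper's actual
    classifier design and training protocol are not given in the abstract."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # +1 output reserved for an "uninformative image" class (assumption)
        self.head = nn.Linear(64, num_classes + 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(1))

@torch.no_grad()
def informativity_rate(clf, images, labels, uninformative_idx):
    """Fraction of translated images judged coherent with their class."""
    pred = clf(images).argmax(dim=1)
    ok = (pred == labels) & (pred != uninformative_idx)
    return ok.float().mean().item()
```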