Models of Visually Grounded Speech Signal Pay Attention To Nouns: a Bilingual Experiment on English and Japanese

We investigate the behaviour of attention in neural models of visually grounded speech trained on two languages: English and Japanese. Experimental results show that attention focuses on nouns and this behaviour holds true for two very typologically different languages. We also draw parallels between artificial neural attention and human attention and show that neural attention focuses on word endings as it has been theorised for human attention. Finally, we investigate how two visually grounded monolingual models can be used to perform cross-lingual speech-to-speech retrieval. For both languages, the enriched bilingual (speech-image) corpora with part-of-speech tags and forced alignments are distributed to the community for reproducible research.


Related content

Attention mechanism


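The attention mechanism was first proposed in the field of computer vision, but it arguably became popular with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment jointly on a machine translation task; their work is generally regarded as the first to apply attention to NLP. Attention-based extensions of RNN models were then applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the rough trend of research on attention.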

A Survey of Neural Networks and Formal Languages (12-page PDF)

Zhuanzhi (专知) Member Service

19+ reads · June 4, 2020

A concise tutorial on recurrent neural networks (RNNs), 37-page PPT

Zhuanzhi (专知) Member Service

168+ reads · May 6, 2020

Should All Cross-Lingual Embeddings Speak English?

Zhuanzhi (专知) Member Service

6+ reads · April 16, 2020

Concise and to the point: a Python tutorial handbook, 206-page PDF

Zhuanzhi (专知) Member Service

46+ reads · March 24, 2020