Convolutional Auto-encoding of Sentence Topics for Image Paragraph Generation - 专知 Papers


Contractive autoencoder · Convolution · Extensibility · Topic model · Long short-term memory network

August 1, 2019

Convolutional Auto-encoding of Sentence Topics for Image Paragraph Generation


Jing Wang, Yingwei Pan, Ting Yao, Jinhui Tang, Tao Mei

From arXiv, IJCAI 2019

Image paragraph generation is the task of producing a coherent story (usually a paragraph) that describes the visual content of an image. The problem is nevertheless not trivial, especially when multiple descriptive and diverse gists must be considered for paragraph generation, which often happens in real images. A valid question is how to encapsulate the gists/topics that are worthy of mention from an image, and then describe the image from one topic to another, yet holistically and with a coherent structure. In this paper, we present a new design, Convolutional Auto-Encoding (CAE), that purely employs a convolutional and deconvolutional auto-encoding framework for topic modeling on the region-level features of an image. Furthermore, we propose an architecture, CAE plus Long Short-Term Memory (dubbed CAE-LSTM), that integrates the learnt topics in support of paragraph generation. Technically, CAE-LSTM capitalizes on a two-level LSTM-based paragraph generation framework with an attention mechanism. The paragraph-level LSTM captures the inter-sentence dependency in a paragraph, while the sentence-level LSTM generates one sentence conditioned on each learnt topic. Extensive experiments are conducted on the Stanford image paragraph dataset, and superior results are reported in comparison to state-of-the-art approaches. More remarkably, CAE-LSTM increases CIDEr performance from 20.93% to 25.15%.
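The convolutional and deconvolutional auto-encoding idea in the abstract can be illustrated with a toy NumPy sketch. It is not the authors' architecture: the sizes `M`, `D`, `K`, the kernel width `k`, the tied encoder/decoder weights, and the choice to treat regions as input channels are all assumptions made for illustration only.

```python
import numpy as np

def conv1d(x, w):
    """Valid, stride-1 convolution. x: (C_in, L), w: (C_out, C_in, k)."""
    c_out, c_in, k = w.shape
    length = x.shape[1]
    out = np.zeros((c_out, length - k + 1))
    for t in range(length - k + 1):
        # Contract the kernel with the window over channels and positions.
        out[:, t] = np.tensordot(w, x[:, t:t + k], axes=([1, 2], [0, 1]))
    return out

def conv_transpose1d(y, w):
    """Transposed convolution (deconvolution), the adjoint of conv1d for the same w."""
    c_out, c_in, k = w.shape
    x = np.zeros((c_in, y.shape[1] + k - 1))
    for t in range(y.shape[1]):
        # Scatter each encoded position back onto the window it was computed from.
        x[:, t:t + k] += np.tensordot(y[:, t], w, axes=([0], [0]))
    return x

# Hypothetical sizes: M regions, D-dim features, K topics, kernel width k.
M, D, K, k = 6, 16, 3, 5
rng = np.random.default_rng(0)
regions = rng.normal(size=(M, D))          # stand-in for region-level features
w = rng.normal(scale=0.1, size=(K, M, k))  # tied encoder/decoder weights (assumption)

topics = np.maximum(conv1d(regions, w), 0.0)   # encoder + ReLU: shape (K, D - k + 1)
reconstruction = conv_transpose1d(topics, w)   # decoder: shape (M, D)
loss = float(np.mean((reconstruction - regions) ** 2))  # reconstruction error
```

Only the forward pass is shown. Training would minimize `loss` with respect to `w`, and in a pipeline like the one the abstract describes, the encoded `topics` would then condition a sentence-level decoder.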


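Related content

Contractive autoencoder

A contractive autoencoder (CAE) is a regularized autoencoder. It adds an explicit regularization term to the encoding h = f(x) that encourages the derivatives of f to be as small as possible. Its penalty term Ω(h) is the squared Frobenius norm (the sum of squared elements) of the Jacobian matrix of partial derivatives of the encoder function.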
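The penalty described above, the squared Frobenius norm of the encoder Jacobian, has a simple closed form for a single sigmoid layer. The following is a minimal sketch under that assumption: the encoder `f(x) = sigmoid(W x + b)`, the names `W` and `b`, and both function names are illustrative and do not come from the source.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def contractive_penalty(x, W, b):
    """Closed-form ||J||_F^2 for h = sigmoid(W x + b), where J = dh/dx."""
    h = sigmoid(W @ x + b)
    # J[j, i] = h_j * (1 - h_j) * W[j, i], so the squared norm factorizes per unit j.
    return float(np.sum((h * (1.0 - h)) ** 2 * np.sum(W ** 2, axis=1)))

def jacobian_frobenius_sq(x, W, b):
    """Same quantity, computed by building the Jacobian explicitly (for checking)."""
    h = sigmoid(W @ x + b)
    J = (h * (1.0 - h))[:, None] * W  # shape (hidden, input)
    return float(np.sum(J ** 2))

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 7))  # 4 hidden units, 7 inputs (arbitrary sizes)
b = rng.normal(size=4)
x = rng.normal(size=7)
penalty = contractive_penalty(x, W, b)
```

In training, this penalty would be scaled by a weighting coefficient and added to the reconstruction loss, which pushes the encoding to change little under small perturbations of the input.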

[KDD 2020] GPT-GNN: Generative Pre-Training of Graph Neural Networks

专知 Member Services

97+ reads · July 3, 2020

Zero-Shot Learning for Text Classification

专知 Member Services

95+ reads · May 31, 2020

[CMU] Survey: Pooling in Graph Convolutional Neural Network

专知 Member Services

45+ reads · April 8, 2020

First look: 13 CVPR 2020 papers with links, open-source code, and commentary

专知 Member Services

49+ reads · February 26, 2020

[New book, 2020] Graph-Powered Machine Learning

专知 Member Services

337+ reads · January 27, 2020

[AAAI 2020, Tsinghua University] Tensor Graph Convolutional Networks for Text Classification

专知 Member Services

75+ reads · January 16, 2020

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知 Member Services

45+ reads · October 17, 2019

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知 Member Services

31+ reads · October 17, 2019

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知 Member Services

53+ reads · October 17, 2019

Latest list of BERT-related papers

专知 Member Services

52+ reads · September 29, 2019

Content-rich Network Embedding

我爱读PAMI

4+ reads · November 5, 2019

Hierarchically Structured Meta-learning

CreateAMind

23+ reads · May 22, 2019

Transferring Knowledge across Learning Processes

CreateAMind

25+ reads · May 18, 2019

Call for Participation: Shared Tasks in NLPCC 2019

China Computer Federation

5+ reads · March 22, 2019

IEEE | DSC 2019 call for papers (EI-indexed)

Call4Papers

10+ reads · February 25, 2019

disentangled-representation-papers

CreateAMind

26+ reads · September 12, 2018

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+ reads · May 25, 2018

Capsule Networks explained

机器学习研究会

10+ reads · November 12, 2017

Adversarial Variational Bayes: Unifying VAE and GAN (code)

CreateAMind

7+ reads · October 4, 2017

Auto-Encoding GAN

CreateAMind

7+ reads · August 4, 2017

Active Generative Adversarial Network for Image Classification

arXiv

4+ reads · June 17, 2019

Improving Visual Question Answering by Referring to Generated Paragraph Captions

arXiv

7+ reads · June 14, 2019

Generative Adversarial Networks and Conditional Random Fields for Hyperspectral Image Classification

arXiv

3+ reads · May 12, 2019

Graph Convolutional Networks for Text Classification

arXiv

11+ reads · October 17, 2018

Exploring Visual Relationship for Image Captioning

arXiv

14+ reads · September 19, 2018

Improving Neural Question Generation using Answer Separation

arXiv

3+ reads · September 7, 2018

CNN+CNN: Convolutional Decoders for Image Captioning

arXiv

21+ reads · May 23, 2018

GraphRNN: A Deep Generative Model for Graphs

arXiv

6+ reads · February 24, 2018

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

arXiv

11+ reads · February 10, 2018

CNN-RNN: A Unified Framework for Multi-label Image Classification

arXiv

7+ reads · April 15, 2016
