Text generation aims to produce human-like natural language output for downstream tasks. It covers a wide range of applications, such as machine translation, document summarization, and dialogue generation. Recently, end-to-end architectures based on deep neural networks have been widely adopted. The end-to-end approach conflates all sub-modules, which used to be designed by complex handcrafted rules, into a holistic encoder-decoder architecture. Given enough training data, it can achieve state-of-the-art performance while avoiding the need for language- or domain-dependent knowledge. Nonetheless, deep learning models are known to be extremely data-hungry, and the text they generate often lacks diversity, interpretability, and controllability. As a result, their output is difficult to trust in real-life applications. Deep latent-variable models, which specify a probability distribution over an intermediate latent process, offer a potential way to address these problems while retaining the expressive power of deep neural networks. This dissertation presents how deep latent-variable models can improve over the standard encoder-decoder model for text generation.
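To make the core idea concrete, the following is a minimal sketch, not the dissertation's actual models, of a latent-variable encoder-decoder in PyTorch: a sequence encoder, a Gaussian latent variable z inserted between encoder and decoder, the reparameterization trick to keep sampling differentiable, and a KL term regularizing the latent space. All class and variable names (e.g., LatentSeq2Seq) are hypothetical, and the dimensions are toy values chosen for illustration.

```python
# A minimal, assumed sketch of a variational encoder-decoder for text.
import torch
import torch.nn as nn

class LatentSeq2Seq(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, latent_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Project the final encoder state to the parameters of q(z | x).
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        # Initialize the decoder from the sampled latent code z.
        self.z_to_hidden = nn.Linear(latent_dim, hidden_dim)
        self.decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src, tgt):
        _, h = self.encoder(self.embed(src))           # h: (1, B, H)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        h0 = torch.tanh(self.z_to_hidden(z)).unsqueeze(0)
        dec_out, _ = self.decoder(self.embed(tgt), h0)
        logits = self.out(dec_out)                     # (B, T, V)
        # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return logits, kl.mean()

# Toy usage: the reconstruction loss is regularized by the KL term.
model = LatentSeq2Seq(vocab_size=1000)
src = torch.randint(0, 1000, (4, 12))   # batch of source token ids
tgt = torch.randint(0, 1000, (4, 12))   # batch of target token ids
logits, kl = model(src, tgt)
recon = nn.functional.cross_entropy(logits.reshape(-1, 1000), tgt.reshape(-1))
loss = recon + kl
```

Under this setup, varying z at decoding time yields different outputs for the same input, which is one route to the diversity and controllability discussed above; the sketch omits the tricks (e.g., KL annealing) that such models typically need in practice.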