Text generation aims to produce human-like natural language output for downstream tasks. It covers a wide range of applications, such as machine translation, document summarization, and dialogue generation. Recently, end-to-end architectures based on deep neural networks have been widely adopted. The end-to-end approach conflates all sub-modules, which used to be designed by complex handcrafted rules, into a holistic encoder-decoder architecture. Given enough training data, it can achieve state-of-the-art performance while avoiding the need for language- or domain-dependent knowledge. Nonetheless, deep learning models are known to be extremely data-hungry, and the text they generate often lacks diversity, interpretability, and controllability. As a result, their output is difficult to trust in real-life applications. Deep latent-variable models, which specify a probability distribution over an intermediate latent process, offer a potential way to address these problems while retaining the expressive power of deep neural networks. This dissertation presents how deep latent-variable models can improve over the standard encoder-decoder model for text generation.
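To make the core idea concrete, the following is a minimal sketch, not the dissertation's actual models, of a latent-variable encoder-decoder in PyTorch: a sequence encoder, a Gaussian latent variable z inserted between encoder and decoder, the reparameterization trick to keep sampling differentiable, and a KL term regularizing the latent space. All class and variable names (e.g., LatentSeq2Seq) are hypothetical, and the dimensions are toy values chosen for illustration.

```python
# A minimal, assumed sketch of a variational encoder-decoder for text.
import torch
import torch.nn as nn

class LatentSeq2Seq(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, latent_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Project the final encoder state to the parameters of q(z | x).
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        # Initialize the decoder from the sampled latent code z.
        self.z_to_hidden = nn.Linear(latent_dim, hidden_dim)
        self.decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src, tgt):
        _, h = self.encoder(self.embed(src))           # h: (1, B, H)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        h0 = torch.tanh(self.z_to_hidden(z)).unsqueeze(0)
        dec_out, _ = self.decoder(self.embed(tgt), h0)
        logits = self.out(dec_out)                     # (B, T, V)
        # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return logits, kl.mean()

# Toy usage: the reconstruction loss is regularized by the KL term.
model = LatentSeq2Seq(vocab_size=1000)
src = torch.randint(0, 1000, (4, 12))   # batch of source token ids
tgt = torch.randint(0, 1000, (4, 12))   # batch of target token ids
logits, kl = model(src, tgt)
recon = nn.functional.cross_entropy(logits.reshape(-1, 1000), tgt.reshape(-1))
loss = recon + kl
```

Under this setup, varying z at decoding time yields different outputs for the same input, which is one route to the diversity and controllability discussed above; the sketch omits the tricks (e.g., KL annealing) that such models typically need in practice.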