A Deep Reinforcement Learning Chatbot

Iulian V. Serban, Chinnadhurai Sankar, Mathieu Germain, Saizheng Zhang, Zhouhan Lin, Sandeep Subramanian, Taesup Kim, Michael Pieper, Sarath Chandar, Nan Rosemary Ke, Sai Rajeshwar, Alexandre de Brebisson, Jose M. R. Sotelo, Dendi Suhubdy, Vincent Michalski, Alexandre Nguyen, Joelle Pineau, Yoshua Bengio

From arXiv; 40 pages, 9 figures, 11 tables

We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including template-based models, bag-of-words models, sequence-to-sequence neural network and latent variable neural network models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than many competing systems. Due to its machine learning architecture, the system is likely to improve with additional data.
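The response-selection idea in the abstract can be illustrated with a minimal sketch. This is not MILABOT's implementation: the two toy models, the `features` function, and `LinearSelectionPolicy` are hypothetical names invented for illustration, and the weights are set by hand, whereas the abstract describes a selection policy trained with reinforcement learning on crowdsourced data and real user interactions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in for an ensemble member: maps a dialogue history
# (list of utterances, most recent last) to one candidate response.
ResponseModel = Callable[[Sequence[str]], str]


def template_model(history: Sequence[str]) -> str:
    """Toy template-based model: always returns the same generic prompt."""
    return "That is interesting. Tell me more."


def retrieval_model(history: Sequence[str]) -> str:
    """Toy retrieval model: keyword lookup on the last user turn."""
    canned = {
        "movie": "Which movie did you like most?",
        "music": "Which music do you listen to?",
    }
    last = history[-1].lower()
    for key, reply in canned.items():
        if key in last:
            return reply
    return "I am not sure about that."


def features(history: Sequence[str], response: str) -> List[float]:
    """Two hand-made features (assumptions, not taken from the paper):
    word overlap with the last user turn, and a fallback-response flag."""
    last_words = set(history[-1].lower().split())
    resp_words = set(response.lower().split())
    overlap = float(len(last_words & resp_words))
    is_fallback = 1.0 if response.startswith("I am not sure") else 0.0
    return [overlap, is_fallback]


@dataclass
class LinearSelectionPolicy:
    # Hand-set here; in a learned system these would be trained parameters.
    weights: List[float]

    def score(self, history: Sequence[str], response: str) -> float:
        return sum(w * f for w, f in zip(self.weights, features(history, response)))

    def select(
        self, history: Sequence[str], models: Dict[str, ResponseModel]
    ) -> Tuple[str, str]:
        # Collect one candidate per model, then return the highest-scoring one.
        candidates = {name: model(history) for name, model in models.items()}
        best = max(candidates, key=lambda name: self.score(history, candidates[name]))
        return best, candidates[best]


models = {"template": template_model, "retrieval": retrieval_model}
policy = LinearSelectionPolicy(weights=[1.0, -2.0])
```

Calling `policy.select(["do you like any movie"], models)` picks the retrieval candidate because it shares words with the user turn, while `policy.select(["hello there"], models)` falls back to the template candidate because the retrieval model's fallback reply is penalized.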
