We present the first complete attempt at concurrently training conversational agents that communicate only via self-generated language. Using DSTC2 as seed data, we trained natural language understanding (NLU) and generation (NLG) networks for each agent and let the agents interact online. We model the interaction as a stochastic collaborative game in which each agent (player) has a role ("assistant", "tourist", "eater", etc.) and its own objectives, and can interact only via the natural language it generates. Each agent therefore needs to learn to operate optimally in an environment with multiple sources of uncertainty (its own NLU and NLG, and the other agent's NLU, policy, and NLG). In our evaluation, we show that the stochastic-game agents outperform deep-learning-based supervised baselines.