Collecting and annotating task-oriented dialogues is time-consuming and costly; thus, zero- and few-shot learning could greatly benefit dialogue state tracking (DST). In this work, we propose an in-context learning (ICL) framework for zero-shot and few-shot DST, where a large pre-trained language model (LM) takes a test instance and a few exemplars as input, and directly decodes the dialogue state without any parameter updates. To better leverage a tabular domain description in the LM prompt, we reformulate DST into a text-to-SQL problem. We also propose a novel approach to retrieve annotated dialogues as exemplars. Empirical results on MultiWOZ show that our method, IC-DST, substantially outperforms previous fine-tuned state-of-the-art models in few-shot settings. In addition, we test IC-DST in zero-shot settings, in which the model only takes a fixed task instruction as input, finding that it outperforms previous zero-shot methods by a large margin.
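To make the prompt formulation concrete, below is a minimal sketch of an in-context DST prompt in the text-to-SQL style: the domain is described as SQL tables, retrieved exemplars pair a dialogue context with an SQL-style state, and the LM completes the query for the test turn. The schema text, exemplar format, and helper function are hypothetical illustrations under these assumptions, not the paper's exact prompt or decoding setup.

```python
def build_icdst_prompt(schema_sql, exemplars, dialogue_context):
    """Assemble an ICL prompt: domain table description, retrieved exemplars, test turn.

    `exemplars` is a list of (dialogue_context, state_sql) pairs retrieved from
    the annotated pool; the LM is expected to continue the final SQL query.
    """
    parts = [schema_sql.strip(), ""]
    for context, state_sql in exemplars:
        parts += [f"-- Dialogue: {context}", state_sql, ""]
    # Leave the test instance open-ended so the LM decodes the dialogue state as SQL.
    parts += [f"-- Dialogue: {dialogue_context}", "SELECT"]
    return "\n".join(parts)


# Hypothetical single-domain schema and one retrieved exemplar (illustrative only).
schema_sql = """
CREATE TABLE hotel (name text, area text, pricerange text, stars int)
"""

exemplars = [
    ("[user] i need a cheap hotel in the north",
     "SELECT * FROM hotel WHERE pricerange = 'cheap' AND area = 'north'"),
]

prompt = build_icdst_prompt(
    schema_sql,
    exemplars,
    "[user] find me a 4 star hotel in the centre",
)
print(prompt)  # the prompt is sent to a large pre-trained LM; its completion encodes the state
```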