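Natural language understanding (NLU) uses a set of AI algorithms to parse text into structured, machine-readable intent and slot information, so that developers can better understand and meet user needs. The 思知 (Sizhi) AI robot open platform offers developers a semantic understanding service for natural language text.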
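To make "intent and slot information" concrete, here is a minimal sketch of what such a structured result could look like. The `NLUResult` type, the intent names, and the regular-expression patterns are hypothetical and chosen only for illustration; they are not the platform's API, and a production system would use trained models rather than hand-written patterns.

```python
import re
from dataclasses import dataclass, field


@dataclass
class NLUResult:
    """Structured, machine-readable output of an NLU step."""
    intent: str
    slots: dict = field(default_factory=dict)


# Hypothetical intent patterns (assumption: illustration only).
PATTERNS = [
    ("book_flight", re.compile(r"flight from (?P<origin>\w+) to (?P<destination>\w+)")),
    ("get_weather", re.compile(r"weather in (?P<city>\w+)")),
]


def parse(text: str) -> NLUResult:
    """Return the first matching intent together with its named slots."""
    lowered = text.lower()
    for intent, pattern in PATTERNS:
        match = pattern.search(lowered)
        if match:
            return NLUResult(intent=intent, slots=match.groupdict())
    # No pattern matched: report an unknown intent with no slots.
    return NLUResult(intent="unknown")


result = parse("Book a flight from Boston to Denver")
print(result.intent, result.slots)
```

The point of the sketch is the output shape: one intent label plus a mapping from slot names to the text spans that fill them.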

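Title

Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

Type

Natural language semantic parsing

Keywords

Natural language understanding and generation, semantic parsing, intelligent search queries, intelligent voice assistants, machine learning

Abstract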

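Virtual assistants such as Amazon Alexa, Apple Siri, and Google Assistant often rely on a semantic parsing component to determine which actions to execute for an utterance spoken by their users. Traditionally, rule-based or statistical slot-filling systems have been used to parse "simple" queries, that is, queries that contain a single action and can be decomposed into a set of non-overlapping entities. More recently, shift-reduce parsers have been proposed to handle more complex utterances. These methods, while powerful, impose specific limitations on the types of queries that can be parsed. In this work, we propose a unified architecture based on sequence-to-sequence models and a pointer-generator network to handle both simple and complex queries. Unlike other work, our approach does not impose any restriction on the semantic parse schema. Furthermore, experiments show that it achieves state-of-the-art performance on three publicly available datasets (ATIS, SNIPS, Facebook TOP), with relative improvements of 3.3% to 7.7% in exact match accuracy over previous systems. Finally, we show the effectiveness of our approach on two internal datasets.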
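One way to picture the "generate rather than parse" idea is a target sequence in which words copied from the utterance are written as pointers to input positions, while intent and slot labels are ordinary output tokens. The sketch below is an illustration under stated assumptions, not the authors' implementation: the `@ptrN` token format, the bracket-style slot markers, and both helper functions are hypothetical, and only flat, non-overlapping slots are covered.

```python
def to_pointer_target(intent, slots):
    """Build a target sequence where slot words become pointer tokens.

    slots: list of (slot_name, start, end) token spans, end exclusive,
    assumed non-overlapping. Token names are illustrative assumptions.
    """
    target = [intent]
    for name, start, end in sorted(slots, key=lambda s: s[1]):
        target.append(f"{name}(")
        # Refer to source words by position instead of spelling them out.
        target.extend(f"@ptr{i}" for i in range(start, end))
        target.append(f"){name}")
    return target


def resolve_pointers(tokens, target):
    """Replace each pointer token with the source word it refers to."""
    resolved = []
    for tok in target:
        if tok.startswith("@ptr"):
            resolved.append(tokens[int(tok[len("@ptr"):])])
        else:
            resolved.append(tok)
    return resolved


tokens = "play the song yesterday by the beatles".split()
slots = [("ArtistName", 5, 7), ("SongName", 3, 4)]

target = to_pointer_target("PlaySongIntent", slots)
print(" ".join(target))
print(" ".join(resolve_pointers(tokens, target)))
```

Because the output vocabulary then only needs labels and position pointers, a decoder of this kind does not have to generate arbitrary words such as song titles; it only has to point at them in the input.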

Authors

Subendhu Rongali∗, University of Massachusetts Amherst

Luca Soldaini, Amazon Alexa Search

Wael Hamza, Amazon Alexa

Emilio Monti, Amazon Alexa AI


Latest Papers

Natural Language Understanding (NLU) is a vital component of dialogue systems, and its ability to detect Out-of-Domain (OOD) inputs is critical in practical applications, since accepting an OOD input that the current system does not support may lead to catastrophic failure. However, most existing OOD detection methods rely heavily on manually labeled OOD samples and cannot take full advantage of unlabeled data. This limits the feasibility of these models in practical applications. In this paper, we propose a novel model that generates high-quality pseudo OOD samples akin to In-Domain (IND) input utterances, and thereby improves the performance of OOD detection. To this end, an autoencoder is trained to map an input utterance into a latent code, and the codes of IND and OOD samples are trained to be indistinguishable by utilizing a generative adversarial network. To provide more supervision signals, an auxiliary classifier is introduced to regularize the generated OOD samples to have indistinguishable intent labels. Experiments show that the pseudo OOD samples generated by our model can be used to effectively improve OOD detection in NLU. We also demonstrate that the effectiveness of these pseudo OOD samples can be further improved by efficiently utilizing unlabeled data.
