Neural language models (LMs) trained on diverse corpora are known to perform well on previously seen entities; however, updating these models with dynamically changing entities such as place names, song titles, and shopping items requires retraining from scratch and collecting full sentences containing those entities. We address this issue by introducing entity-aware language models (EALM), which integrate entity models trained on catalogues of entities into a pre-trained LM. The combined language model adaptively adds information from the entity models to the pre-trained LM depending on the sentence context. Because the entity models can be updated independently of the pre-trained LM, we can influence the distribution of entities output by the final LM without any further training of the pre-trained LM. We show significant perplexity improvements on task-oriented dialogue datasets, especially on long-tail utterances, together with an ability to continually adapt to new entities (to an extent).
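For concreteness, the sketch below illustrates one way such a context-dependent combination could be realized: a learned gate interpolates between the frozen pre-trained LM's next-token distribution and the entity model's, so the entity model can be swapped or retrained without touching the pre-trained weights. This is a minimal illustrative assumption, not the paper's exact formulation; the names (EntityAwareLM, gate) and the scalar-gate design are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EntityAwareLM(nn.Module):
    """Sketch: a context-dependent gate mixes the next-token distribution
    of a frozen pre-trained LM with that of an independently trained
    (and swappable) entity model."""

    def __init__(self, pretrained_lm: nn.Module, entity_model: nn.Module,
                 hidden_dim: int):
        super().__init__()
        self.pretrained_lm = pretrained_lm  # frozen; hidden state -> vocab logits
        self.entity_model = entity_model    # retrainable on entity catalogues
        # Scalar gate in [0, 1] predicted from the sentence context.
        self.gate = nn.Linear(hidden_dim, 1)

    def forward(self, context_hidden: torch.Tensor) -> torch.Tensor:
        lm_probs = F.softmax(self.pretrained_lm(context_hidden), dim=-1)
        ent_probs = F.softmax(self.entity_model(context_hidden), dim=-1)
        g = torch.sigmoid(self.gate(context_hidden))  # (batch, 1)
        # Interpolate in probability space; g decides how much entity
        # probability mass to inject for this context.
        return torch.log((1.0 - g) * lm_probs + g * ent_probs)

# Toy usage: stand-in linear "LMs" over a 100-word vocabulary.
hidden_dim, vocab = 32, 100
model = EntityAwareLM(nn.Linear(hidden_dim, vocab),
                      nn.Linear(hidden_dim, vocab), hidden_dim)
log_probs = model(torch.randn(4, hidden_dim))  # (4, 100) next-token log-probs
```

Under this design, updating the catalogue only requires retraining the entity model; the pre-trained LM and the gate remain fixed, which matches the abstract's claim of adaptation without further training of the pre-trained LM.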