We formalize human language understanding as a structured prediction task whose output is a partially ordered set (poset). Current encoder-decoder architectures do not properly account for the poset structure of semantics and therefore suffer from poor compositional generalization. In this paper, we propose a novel hierarchical poset decoding paradigm for compositional generalization in language. Intuitively: (1) the proposed paradigm enforces partial permutation invariance in semantics, thus avoiding overfitting to biased ordering information; (2) the hierarchical mechanism allows the decoder to capture high-level structures of posets. We evaluate our proposed decoder on Compositional Freebase Questions (CFQ), a large and realistic natural language question answering dataset that is specifically designed to measure compositional generalization. Results show that it outperforms current decoders.
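To make the idea of partial permutation invariance concrete, the following is a minimal sketch, not the paper's decoder. It assumes a toy semantic representation in which a query head must precede three conjuncts that are mutually unordered. The element strings and the helper `is_linear_extension` are hypothetical names chosen for illustration.

```python
from itertools import permutations


def is_linear_extension(order, relations):
    """Return True if `order` respects every (a, b) pair, i.e. a comes before b."""
    position = {item: i for i, item in enumerate(order)}
    return all(position[a] < position[b] for a, b in relations)


# Hypothetical toy semantics: one head, three conjuncts with no order among them.
elements = ["SELECT ?x", "?x directed M0", "?x edited M1", "?x a Person"]

# The partial order only says the head precedes each conjunct.
relations = {("SELECT ?x", conjunct) for conjunct in elements[1:]}

# Every sequence consistent with the partial order is a valid linearization.
linearizations = [
    order for order in permutations(elements)
    if is_linear_extension(order, relations)
]

print(len(linearizations))
```

A sequence decoder trained on a single one of these linearizations treats the others as errors, even though all of them denote the same poset. Predicting the poset directly removes that arbitrary ordering choice from the training signal.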