Practitioners apply neural networks to increasingly complex problems in natural language processing, such as syntactic parsing and semantic role labeling, which have rich output structures. Many such structured-prediction problems require deterministic constraints on the output values; for example, in sequence-to-sequence syntactic parsing, we require that the sequential outputs encode valid trees. While hidden units might capture such properties, the network is not always able to learn such constraints from the training data alone, and practitioners must then resort to post-processing. In this paper, we present an inference method for neural networks that enforces deterministic constraints on outputs without performing rule-based post-processing or expensive discrete search. Instead, in the spirit of gradient-based training, we enforce constraints with gradient-based inference (GBI): for each input at test-time, we nudge continuous model weights until the network's unconstrained inference procedure generates an output that satisfies the constraints. We study the efficacy of GBI on three tasks with hard constraints: semantic role labeling, syntactic parsing, and sequence transduction. In each case, the algorithm not only satisfies constraints but improves accuracy, even when the underlying network is state-of-the-art.
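As a rough illustration of the test-time procedure described above, the following PyTorch-style sketch shows one way such gradient-based inference could be organized. The `decode` and `violation_loss` callables are hypothetical placeholders introduced here for exposition, not the paper's actual implementation: `decode` runs the network's ordinary unconstrained inference, and `violation_loss` is assumed to be a differentiable penalty that is zero exactly when the output satisfies the hard constraints.

```python
import copy
import torch

def gradient_based_inference(model, x, decode, violation_loss,
                             lr=0.1, max_steps=20):
    """Nudge a per-input copy of the model's weights at test time until
    its unconstrained decoder produces a constraint-satisfying output.

    decode(model, x)            -> candidate output y (unconstrained inference)
    violation_loss(model, x, y) -> differentiable scalar, zero iff y satisfies
                                   the hard constraints (assumed callables)
    """
    model = copy.deepcopy(model)                 # leave the original weights untouched
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    y = decode(model, x)
    for _ in range(max_steps):
        loss = violation_loss(model, x, y)
        if loss.item() == 0.0:                   # constraints already satisfied
            break
        optimizer.zero_grad()
        loss.backward()                          # gradient of the violation w.r.t. weights
        optimizer.step()                         # nudge the weights for this input only
        y = decode(model, x)                     # re-run ordinary unconstrained inference
    return y
```

The weight updates are discarded after each input, so the procedure behaves like a per-example search in weight space rather than additional training.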