Neural sequence-to-sequence models are currently the predominant choice for language generation tasks. Yet, on word-level tasks, exact inference over these models reveals that the empty string is often the global optimum. Prior work has speculated that this phenomenon is a result of the inadequacy of neural models for language generation. However, in the case of morphological inflection, we find that the empty string is almost never the most probable solution under the model. Further, greedy search often finds the global optimum. These observations suggest that the poor calibration of many neural models may stem from characteristics of a specific subset of tasks rather than a general ill-suitedness of such models for language generation.