We conduct an empirical evaluation of extrapolation performance when conditioning on scalar control inputs, such as desired output length, desired edit distance from an input sentence, and desired sentiment, across three text generation tasks. Specifically, we study a zero-shot setting in which models must generalize to ranges of control values unseen during training. We focus on evaluating popular embedding methods for scalar inputs, including both learnable and sinusoidal embeddings, as well as simpler alternatives. Surprisingly, our findings indicate that the simplest strategy, feeding the scalar value to the model directly without any further encoding, most reliably enables successful extrapolation.
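To make the compared conditioning strategies concrete, the following is a minimal PyTorch sketch of the three families of scalar encodings, not the paper's actual implementation; the dimensions, binning scheme, and variable names (`d_model`, `num_bins`, `max_value`) are illustrative assumptions.

```python
# Sketch of three ways a scalar control value c (e.g. a desired output
# length) can be fed to a conditional generation model. Hypothetical
# hyperparameters; only the structure of each strategy is the point.
import math
import torch
import torch.nn as nn

d_model = 16

# 1) Direct scalar input: use c as-is, here lifted to d_model dimensions
#    with a single linear projection so it can be added to hidden states.
direct_proj = nn.Linear(1, d_model)

# 2) Learnable embedding: bucket the scalar into discrete bins; only bins
#    observed during training receive trained rows.
num_bins, max_value = 50, 100
learned_emb = nn.Embedding(num_bins, d_model)

def embed_learned(c: torch.Tensor) -> torch.Tensor:
    bins = (c / max_value * (num_bins - 1)).long().clamp(0, num_bins - 1)
    return learned_emb(bins)

# 3) Sinusoidal embedding: fixed Transformer-style encoding of the scalar.
def embed_sinusoidal(c: torch.Tensor) -> torch.Tensor:
    half = d_model // 2
    freqs = torch.exp(-math.log(10000.0) * torch.arange(half) / half)
    angles = c.unsqueeze(-1) * freqs                      # (batch, half)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

c = torch.tensor([25.0, 150.0])  # 150 falls outside the trained range [0, 100]
print(direct_proj(c.unsqueeze(-1)).shape)  # raw scalar: defined for any value
print(embed_learned(c).shape)              # out-of-range values clamp to the last bin
print(embed_sinusoidal(c).shape)           # defined everywhere, but periodic in c
```

The comments on the final three lines hint at why extrapolation behavior differs: a learnable embedding has no trained representation for unseen control values, whereas the raw scalar varies smoothly beyond the training range.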