Incremental Processing in the Age of Non-Incremental Encoders: An Empirical Assessment of Bidirectional Models for Incremental NLU

While humans process language incrementally, the best language encoders currently used in NLP do not. Both bidirectional LSTMs and Transformers assume that the sequence to be encoded is available in full, to be processed either forwards and backwards (BiLSTMs) or as a whole (Transformers). We investigate how they behave under incremental interfaces, where partial output must be provided based on the partial input seen up to a given time step, as may happen in interactive systems. We test five models on various NLU datasets and compare their performance using three incremental evaluation metrics. The results support the possibility of using bidirectional encoders in incremental mode while retaining most of their non-incremental quality. The "omni-directional" BERT model, which achieves better non-incremental performance, is affected more by incremental access. This can be alleviated by adapting the training regime (truncated training) or the testing procedure, either by delaying the output until some right context is available or by incorporating hypothetical right contexts generated by a language model such as GPT-2.
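The incremental interface and the delayed-output strategy described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_tagger` is a hypothetical stand-in for a non-incremental encoder (not one of the five models tested), `incremental_outputs` re-runs it on each growing prefix, and `count_revisions` is a simple revision count, not necessarily one of the three metrics used in the paper.

```python
from typing import Callable, List, Sequence

Tagger = Callable[[Sequence[str]], List[str]]


def toy_tagger(tokens: Sequence[str]) -> List[str]:
    """Hypothetical non-incremental tagger: the label of "book"
    depends on the token to its right, so it needs right context."""
    labels = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok == "book" and nxt in {"a", "the"}:
            labels.append("VERB")
        elif tok == "book":
            labels.append("NOUN")
        else:
            labels.append("OTHER")
    return labels


def incremental_outputs(
    tagger: Tagger, tokens: Sequence[str], delay: int = 0
) -> List[List[str]]:
    """Re-run the tagger on every prefix of the input.

    At step t only labels for tokens that already have `delay` tokens
    of right context are exposed; the last step exposes all labels.
    """
    outputs = []
    n = len(tokens)
    for t in range(1, n + 1):
        labels = tagger(tokens[:t])
        visible = t if t == n else max(0, t - delay)
        outputs.append(labels[:visible])
    return outputs


def count_revisions(outputs: List[List[str]]) -> int:
    """Count labels that change between consecutive partial outputs."""
    revisions = 0
    for prev, curr in zip(outputs, outputs[1:]):
        revisions += sum(1 for a, b in zip(prev, curr) if a != b)
    return revisions


if __name__ == "__main__":
    sentence = ["please", "book", "a", "flight"]
    for d in (0, 1):
        outs = incremental_outputs(toy_tagger, sentence, delay=d)
        print(f"delay={d}: revisions={count_revisions(outs)}")
```

In this toy setting, the label for "book" is revised once when no delay is used, and waiting for one token of right context removes that revision at the cost of later output. This is the kind of trade-off the delayed-output strategy targets.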
