Even as pre-trained language encoders such as BERT are shared across many tasks, the output layers of question answering and text classification models differ significantly. Span decoders are frequently used for question answering, while fixed-class classification layers are used for text classification. We show that this distinction is unnecessary, and that both tasks can be unified as span extraction. A unified span-extraction approach yields superior or comparable performance in multi-task learning, low-data, and supplementary supervised pretraining experiments on several text classification and question answering benchmarks.
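To make the unification concrete, the sketch below illustrates the idea under stated assumptions: it is a minimal, hypothetical PyTorch example, and the names `SpanScorer` and `extract_span` are illustrative, not the paper's released code. A single span-extraction head can serve both task types: for question answering the gold span lies in the passage, while for classification the label names can be appended to the input text so that the gold span is the correct label's surface form.

```python
# Minimal sketch of span extraction as a shared output layer, assuming a
# BERT-style encoder. All names here are illustrative, not the paper's code.
import torch
import torch.nn as nn


class SpanScorer(nn.Module):
    """Scores each token position as a candidate span start or end."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.start_head = nn.Linear(hidden_size, 1)
        self.end_head = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor):
        # hidden_states: (batch, seq_len, hidden) from any encoder, e.g. BERT
        start_logits = self.start_head(hidden_states).squeeze(-1)
        end_logits = self.end_head(hidden_states).squeeze(-1)
        return start_logits, end_logits  # each (batch, seq_len)


def extract_span(start_logits, end_logits):
    """Return the (start, end) pair maximizing start + end score, start <= end."""
    seq_len = start_logits.size(-1)
    # scores[b, i, j] = start_logits[b, i] + end_logits[b, j]
    scores = start_logits.unsqueeze(2) + end_logits.unsqueeze(1)
    valid = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool))  # j >= i
    scores = scores.masked_fill(~valid, float("-inf"))
    flat = scores.flatten(1).argmax(dim=-1)
    start = torch.div(flat, seq_len, rounding_mode="floor")
    end = flat % seq_len
    return start, end


if __name__ == "__main__":
    # For QA, the gold span lies in the passage; for classification, label
    # names (e.g. "positive negative") are appended to the input so the gold
    # span is the correct label's tokens. Either way, the same head is
    # trained with cross-entropy over start and end positions.
    scorer = SpanScorer(hidden_size=32)
    encoder_output = torch.randn(2, 16, 32)  # stand-in for BERT hidden states
    start_logits, end_logits = scorer(encoder_output)
    print(extract_span(start_logits, end_logits))
```

Because the same start/end scoring head and loss apply regardless of whether the target span is a passage answer or a label name, the encoder and output layer need no task-specific changes across QA and classification.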