We describe the Uppsala NLP submission to SemEval-2021 Task 2 on multilingual and cross-lingual word-in-context disambiguation. We explore the usefulness of three pre-trained multilingual language models: XLM-RoBERTa (XLMR), Multilingual BERT (mBERT), and multilingual distilled BERT (mDistilBERT). We compare these three models in two setups: fine-tuning and feature extraction. In the latter setup we also experiment with dependency-based information. We find that fine-tuning outperforms feature extraction. XLMR performs better than mBERT in the cross-lingual setting, both with fine-tuning and with feature extraction, whereas the two models perform similarly in the multilingual setting. mDistilBERT performs poorly with fine-tuning but gives results similar to the other models when used as a feature extractor. We submitted our two best systems, based on fine-tuned XLMR and mBERT.
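To illustrate the feature-extraction setup described above, the sketch below embeds a target word in two contexts with a pre-trained multilingual encoder and compares the vectors by cosine similarity. This is a minimal illustration, not the submitted system: the model choice (`xlm-roberta-base`), mean-pooling over subwords, and the example spans are assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Hypothetical model choice; the paper also considers mBERT and mDistilBERT.
MODEL_NAME = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def target_embedding(sentence: str, start: int, end: int) -> torch.Tensor:
    """Mean-pool the final-layer vectors of the subwords covering sentence[start:end]."""
    enc = tokenizer(sentence, return_tensors="pt", return_offsets_mapping=True)
    offsets = enc.pop("offset_mapping")[0]
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_size)
    # Keep subword tokens whose character span overlaps the target word;
    # special tokens have empty (0, 0) offsets and are excluded by e > s.
    mask = [(s < end and e > start and e > s) for s, e in offsets.tolist()]
    return hidden[torch.tensor(mask)].mean(dim=0)

# "bank" in two different senses; spans index into each sentence.
v1 = target_embedding("He sat on the bank of the river.", 14, 18)
v2 = target_embedding("She deposited money at the bank.", 27, 31)
sim = torch.cosine_similarity(v1, v2, dim=0).item()
```

In a word-in-context classifier, `sim` (or the concatenated vectors) would feed a thresholding or classification layer; in the fine-tuning setup, the encoder weights would instead be updated end-to-end on the task data.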