We investigate the performance of sentence embedding models on several tasks for the Russian language. Our comparison covers multiple-choice question answering, next sentence prediction, and paraphrase identification. We employ FastText embeddings as a baseline and compare them with ELMo and BERT embeddings. We conduct two series of experiments, one unsupervised (i.e., based on a similarity measure only) and one supervised. Finally, we present datasets for multiple-choice question answering and next sentence prediction in Russian.
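
To illustrate what an unsupervised, similarity-only approach to multiple-choice question answering might look like, here is a minimal sketch. It assumes cosine similarity as the measure, which the abstract does not specify, and it uses hand-written toy vectors in place of real FastText, ELMo, or BERT sentence embeddings. The function names `cosine_similarity` and `pick_answer` are hypothetical and chosen only for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_answer(question_vec, candidate_vecs):
    """Return the index of the candidate most similar to the question, plus all scores."""
    scores = [cosine_similarity(question_vec, c) for c in candidate_vecs]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores


# Toy vectors standing in for sentence embeddings (assumption: in practice
# these would come from an embedding model, not be written by hand).
question = [1.0, 0.0, 1.0]
candidates = [
    [0.0, 1.0, 0.0],    # orthogonal to the question
    [2.0, 0.1, 1.9],    # nearly parallel to the question
    [-1.0, 0.0, -1.0],  # opposite direction
]

best, scores = pick_answer(question, candidates)
print(best, scores)
```

No training is involved in this setup: the answer is chosen purely by ranking candidates on their similarity to the question, so the result depends entirely on the quality of the embeddings.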