Pre-trained language models are known to propagate biases from their underlying training data to downstream tasks. However, these findings are predominantly based on monolingual English models; few studies investigate the biases encoded in language models for other languages. In this paper, we fill this gap by analysing gender bias in West Slavic language models. We introduce the first template-based dataset in Czech, Polish, and Slovak for measuring gender bias towards male, female, and non-binary subjects. We complete the sentences using both mono- and multilingual language models and assess their suitability for the masked language modelling objective. Next, we measure the gender bias encoded in West Slavic language models by quantifying the toxicity and genderness of the generated words. We find that these language models produce hurtful completions that depend on the subject's gender. Perhaps surprisingly, Czech, Slovak, and Polish language models produce more hurtful completions with men as subjects, which, upon inspection, we find is due to completions being related to violence, death, and sickness.