Gender, race and social biases have recently been identified as clear examples of unfairness in applications of Natural Language Processing. A key path towards fairness is to understand, analyse and interpret our data and algorithms. Recent studies have shown that the human-generated data used in training is an apparent source of bias. In addition, current algorithms have been shown to amplify the biases present in the data. To further address these concerns, in this paper we study how a state-of-the-art recurrent neural language model behaves when trained on data that under-represents females, using pre-trained standard and debiased word embeddings. Results show that language models trained on unbalanced data inherit higher bias when using pre-trained embeddings than when using embeddings trained within the task. Moreover, results show that, on the same data, language models inherit lower bias when using debiased pre-trained embeddings than when using standard pre-trained embeddings.