There are concerns that neural language models may preserve some of the stereotypes of the underlying societies that generate the large corpora needed to train these models. For example, gender bias is a significant problem when generating text, and its unintended memorization could impact the user experience of many applications (e.g., the Smart Compose feature in Gmail). In this paper, we introduce a novel architecture that decouples the representation learning of a neural model from its memory management role. This architecture allows us to update a memory module at an equal ratio across gender types, addressing biased correlations directly in the latent space. We show experimentally that our approach can mitigate gender bias amplification in the automatic generation of news articles while providing similar perplexity values when extending the Sequence2Sequence architecture.
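To make the decoupling concrete, below is a minimal sketch (not the authors' released code) of the idea as described: an external memory module kept separate from the representation-learning network, with writes applied at an equal ratio across gender types. All names here (`GenderBalancedMemory`, `balanced_update`, `gender_type`) are hypothetical illustrations under that assumption.

```python
# Minimal sketch: an external memory module, decoupled from the encoder/
# decoder that learn representations, whose writes are balanced across
# gender types. Hypothetical names; not the paper's actual implementation.

import torch
import torch.nn as nn


class GenderBalancedMemory(nn.Module):
    """External memory with one slot group per gender type.

    The Sequence2Sequence encoder/decoder learn representations as usual;
    this module only manages memory, keeping the two roles decoupled.
    """

    def __init__(self, num_gender_types: int, slots_per_type: int, dim: int):
        super().__init__()
        # memory: (gender_types, slots, dim); a buffer, not a trained weight
        self.register_buffer(
            "memory", torch.zeros(num_gender_types, slots_per_type, dim)
        )
        self.write_gate = nn.Linear(dim, dim)

    def update(self, latent: torch.Tensor, gender_type: int, lr: float = 0.1):
        """Blend one latent vector into the slots of a single gender type."""
        write = torch.tanh(self.write_gate(latent)).detach()
        self.memory[gender_type] = (1 - lr) * self.memory[gender_type] + lr * write

    def balanced_update(self, latents_by_type: dict[int, torch.Tensor]):
        """Apply the same number of writes per gender type (equal ratio),
        so no gender's correlations dominate the latent space."""
        n = min(len(v) for v in latents_by_type.values())
        for gender_type, latents in latents_by_type.items():
            for latent in latents[:n]:  # truncate to the smallest group
                self.update(latent, gender_type)

    def read(self, query: torch.Tensor) -> torch.Tensor:
        """Attention read over all slots, flattened across gender types."""
        flat = self.memory.flatten(0, 1)             # (G * S, dim)
        scores = torch.softmax(flat @ query, dim=0)  # (G * S,)
        return scores @ flat                         # (dim,)
```

In this reading of the abstract, the balanced write schedule is the mechanism that prevents bias amplification: because each gender type contributes the same number of memory updates, imbalances in the training corpus are not reinforced in the memory the decoder attends over.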