Improving Fairness in Speaker Recognition

The human voice conveys unique characteristics of an individual, making voice biometrics a key technology for verifying identities in various industries. Despite the impressive progress of speaker recognition systems in terms of accuracy, a number of ethical and legal concerns have been raised, specifically relating to the fairness of such systems. In this paper, we explore the disparity in performance achieved by state-of-the-art deep speaker recognition systems when different groups of individuals characterized by a common sensitive attribute (e.g., gender) are considered. To mitigate the unfairness we uncovered in an exploratory study, we investigate whether balancing the representation of the different groups of individuals in the training set can lead to more equal treatment of these demographic groups. Experiments on two state-of-the-art neural architectures and a large-scale public dataset show that models trained with demographically balanced training sets exhibit fairer behavior across groups, while still being accurate. Our study is expected to provide a solid basis for instilling beyond-accuracy objectives (e.g., fairness) in speaker recognition.
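The abstract does not say how the training set is balanced. As a minimal sketch of one simple possibility, the snippet below downsamples speakers so that every demographic group contributes the same number of speakers. The function name `balance_speakers`, the input format (a mapping from speaker ID to group label), and the example data are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def balance_speakers(speaker_groups, seed=0):
    """Downsample speakers so each group has the same number of speakers.

    speaker_groups: dict mapping speaker id -> group label
                    (hypothetical input format, for illustration only).
    Returns a sorted list of the selected speaker ids.
    """
    # Collect the speakers belonging to each group.
    by_group = defaultdict(list)
    for speaker, group in speaker_groups.items():
        by_group[group].append(speaker)

    if not by_group:
        return []

    # The smallest group determines how many speakers each group keeps.
    target = min(len(members) for members in by_group.values())

    # A seeded generator makes the selection reproducible.
    rng = random.Random(seed)
    selected = []
    for group in sorted(by_group):
        selected.extend(rng.sample(sorted(by_group[group]), target))
    return sorted(selected)


# Hypothetical toy data: 2 female speakers and 4 male speakers.
speakers = {
    "spk1": "female", "spk2": "female",
    "spk3": "male", "spk4": "male", "spk5": "male", "spk6": "male",
}
balanced = balance_speakers(speakers)
```

Downsampling to the smallest group is only one option; it discards data from larger groups, so oversampling or reweighting may be preferable when the smallest group is very small.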

Related Content

Voiceprint Recognition

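Speaker recognition, also called voiceprint recognition (VPR), is a biometric technology that automatically determines a speaker's identity from the speaker-specific information contained in speech, using computers and modern information recognition techniques. The goal of speaker recognition research is to extract features from speech that characterize the speaker and to build effective models and systems for automatic, accurate speaker identification.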