Unlike text, speech conveys information about the speaker, such as gender, through acoustic cues like pitch. This gives rise to modality-specific bias concerns. For example, in speech translation (ST), when translating from languages with notional gender, such as English, into languages where gender-ambiguous terms referring to the speaker are assigned grammatical gender, the speaker's vocal characteristics may play a role in gender assignment. This risks misgendering speakers, whether through masculine defaults or vocal-based assumptions. Yet, how ST models make these decisions remains poorly understood. We investigate the mechanisms ST models use to assign gender to speaker-referring terms across three language pairs (en-es/fr/it), examining how training data patterns, internal language model (ILM) biases, and acoustic information interact. We find that models do not simply replicate term-specific gender associations from training data, but learn broader patterns of masculine prevalence. While the ILM exhibits strong masculine bias, models can override these preferences based on acoustic input. Using contrastive feature attribution on spectrograms, we reveal that the model with higher gender accuracy relies on a previously unknown mechanism: using first-person pronouns to link gendered terms back to the speaker, accessing gender information distributed across the frequency spectrum rather than concentrated in pitch.
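The contrastive feature attribution idea can be sketched as follows. The abstract does not specify the exact attribution method, so this is a minimal illustrative sketch, not the paper's implementation: it attributes the *difference* between the logits of two contrastive target tokens (e.g. a masculine vs. feminine form of the same word) to individual spectrogram bins, using a toy linear head standing in for the ST model. All names and shapes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an ST model's output head: maps an (n_freq x n_time)
# spectrogram to logits for two contrastive target tokens, e.g. the
# masculine vs. feminine form of a speaker-referring adjective.
# A real ST model is nonlinear; linear weights keep the sketch exact.
n_freq, n_time = 80, 50
W_masc = rng.normal(size=(n_freq, n_time))
W_fem = rng.normal(size=(n_freq, n_time))

def logits(spec):
    """Logits for the (masculine, feminine) token pair."""
    return np.array([np.sum(W_masc * spec), np.sum(W_fem * spec)])

def contrastive_attribution(spec):
    """Attribute the logit *difference* (masc - fem) to each bin.

    For a linear head the gradient of the difference is exact and
    input-independent: d(logit_m - logit_f)/d(spec) = W_masc - W_fem.
    For a real model one would backpropagate this difference instead.
    """
    return W_masc - W_fem

spec = rng.normal(size=(n_freq, n_time))
attr = contrastive_attribution(spec)

# Aggregate |attribution| over time to see which frequency bands drive
# the gender decision: mass spread across the spectrum vs. a peak in
# the pitch band would distinguish the two hypotheses in the abstract.
freq_profile = np.abs(attr).sum(axis=1)
print(freq_profile.shape)
```

Summing absolute attributions over time yields a per-frequency relevance profile, which is one simple way to check whether gender cues are concentrated around fundamental-frequency bins or distributed more broadly.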