Speech emotion recognition aims to identify emotional states from speech signals and is widely applied in human-computer interaction, education, healthcare, and many other fields. Because speech data contain rich sensitive information, speakers may request that part of their data be deleted for privacy reasons. Current machine unlearning approaches largely depend on data beyond the samples to be forgotten. This reliance is problematic when data redistribution is restricted, and it demands substantial computational resources in big-data settings. We propose a novel adversarial-attack-based approach that fine-tunes a pre-trained speech emotion recognition model using only the data to be forgotten. Experimental results demonstrate that the proposed approach effectively removes the knowledge of the data to be forgotten from the model while preserving high emotion recognition performance on the test set.
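To make the idea of unlearning from the forget set alone more concrete, the following is a minimal sketch and not the procedure of this work: the abstract does not specify the attack, the loss, or the model. The sketch assumes a toy linear softmax classifier in place of a pre-trained speech emotion recognition model, an untargeted FGSM attack as the adversarial step, and fine-tuning toward the wrong class that the adversarial example is closest to. All names (`predict_proba`, `fgsm`, `unlearn`, `W0`, `b0`, `forget`) and all hyperparameters are hypothetical.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def predict_proba(W, b, x):
    """Class probabilities of a linear softmax model (one weight row per class)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)


def fgsm(W, b, x, y, eps):
    """Untargeted FGSM (assumed attack): step along sign of d(cross-entropy)/dx."""
    p = predict_proba(W, b, x)
    # For a linear softmax model: dL/dx_j = sum_k (p_k - 1[k == y]) * W[k][j]
    grad = [sum((p[k] - (k == y)) * W[k][j] for k in range(len(W)))
            for j in range(len(x))]
    return [xj + eps * ((g > 0) - (g < 0)) for xj, g in zip(x, grad)]


def unlearn(W, b, forget, eps=0.5, lr=0.1, steps=20):
    """Fine-tune on the forget set only, using adversarially chosen labels.

    Assumption: each forget sample is relabelled with the most likely wrong
    class of its adversarial example, then one SGD step is taken on it.
    """
    W = [row[:] for row in W]  # copy so the original model is left untouched
    b = b[:]
    for _ in range(steps):
        for x, y in forget:
            x_adv = fgsm(W, b, x, y, eps)
            p_adv = predict_proba(W, b, x_adv)
            y_adv = max((k for k in range(len(b)) if k != y),
                        key=lambda k: p_adv[k])
            p = predict_proba(W, b, x)
            for k in range(len(b)):
                g = p[k] - (k == y_adv)  # d(cross-entropy)/d(logit_k)
                for j in range(len(x)):
                    W[k][j] -= lr * g * x[j]
                b[k] -= lr * g
    return W, b


# Toy "pre-trained" model: 3 emotion classes, 2 hypothetical acoustic features.
W0 = [[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]
b0 = [0.0, 0.0, 0.0]
forget = [([1.0, 0.2], 0)]  # one sample (features, label) to be forgotten

W1, b1 = unlearn(W0, b0, forget)
```

Comparing `predict_proba(W0, b0, forget[0][0])` with `predict_proba(W1, b1, forget[0][0])` shows how much confidence in the original label the toy model loses. The sketch does not measure whether test-set performance is preserved, which is the second property claimed above; checking that would require held-out data and a realistic model.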