Multimodal Large Language Models (MLLMs) achieve remarkable capabilities but can inadvertently memorize privacy-sensitive information. Although existing unlearning methods can remove such knowledge, they fail to achieve benign forgetting: in erasing the target information, they often degrade the model's general image-understanding performance. To address this, we propose the Sculpted Memory Forgetting Adapter (SMFA), which confines forgetting to targeted memory regions while preserving overall capabilities. SMFA first fine-tunes the model to replace sensitive responses with refusals, yielding a memory forgetting adapter, and then applies a retaining-anchor-guided masking mechanism to prevent interference with unrelated knowledge and understanding ability. To systematically evaluate selective MLLM unlearning, we introduce S-MLLMUn Bench, the first benchmark designed to jointly assess the removal of sensitive knowledge and the retention of general visual understanding. Extensive experiments show that, unlike prior methods, SMFA achieves precise and controllable unlearning while maintaining the model's foundational image understanding.
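To make the two-step recipe concrete, the sketch below illustrates one plausible reading of the masking step: weight updates learned by the refusal fine-tuned adapter are kept only where they interfere least with gradients computed on retained anchor samples, and the sculpted delta is then merged into the frozen base weights. This is a minimal, hypothetical sketch; the magnitude-ratio scoring rule, the `keep_ratio` parameter, and all function names are illustrative assumptions, not the actual SMFA implementation.

```python
# Hypothetical sketch of retaining-anchor-guided masking for a forgetting
# adapter. The scoring heuristic below is an assumption for illustration.
import torch

def sculpt_adapter_delta(delta_w: torch.Tensor,
                         retain_anchor_grad: torch.Tensor,
                         keep_ratio: float = 0.1) -> torch.Tensor:
    """Keep only the adapter updates least entangled with retained knowledge.

    delta_w:            weight update learned by the forgetting adapter
                        (fine-tuned to emit refusals on sensitive queries).
    retain_anchor_grad: gradient of a retention loss on anchor samples of
                        general image-understanding data (assumed available).
    keep_ratio:         fraction of adapter entries left active (assumed).
    """
    # Score each entry: large forgetting update, small retention gradient.
    score = delta_w.abs() / (retain_anchor_grad.abs() + 1e-8)
    k = max(1, int(keep_ratio * score.numel()))
    threshold = score.flatten().topk(k).values.min()
    mask = (score >= threshold).to(delta_w.dtype)
    # Sculpted delta: forgetting confined to the targeted memory regions.
    return delta_w * mask

# Usage: merge the sculpted delta into the frozen base weights.
base_w = torch.randn(512, 512)                  # stand-in for one base layer
delta_w = 0.01 * torch.randn(512, 512)          # from refusal fine-tuning
anchor_grad = torch.randn(512, 512)             # from a retain-set backward pass
merged_w = base_w + sculpt_adapter_delta(delta_w, anchor_grad)
```

Under this reading, the mask is what prevents the refusal behavior from bleeding into unrelated knowledge: entries of the adapter update that overlap with high-magnitude retention gradients are zeroed out before merging.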