Cognitive diagnosis models (CDMs) are pivotal for creating fine-grained learner profiles in modern intelligent education platforms. However, these models are trained on sensitive student data, raising significant privacy concerns. While membership inference attacks (MIAs) have been studied in various domains, their application to CDMs remains a critical research gap, leaving the privacy risks of these models unquantified. This paper presents the first systematic investigation of MIAs against CDMs. We introduce a novel and realistic grey-box threat model that exploits the explainability features of these platforms, in which a model's internal knowledge state vectors are exposed to users through visualizations such as radar charts. We demonstrate that these vectors can be accurately reverse-engineered from such visualizations, creating a potent attack surface. Based on this threat model, we propose a profile-based MIA (P-MIA) framework that leverages both the model's final prediction probabilities and the exposed internal knowledge state vectors as attack features. Extensive experiments on three real-world datasets against mainstream CDMs show that our grey-box attack significantly outperforms standard black-box baselines. Furthermore, we showcase the utility of P-MIA as an auditing tool by evaluating the efficacy of machine unlearning techniques and revealing their limitations.