Latent factor models for recommender systems represent users and items as low-dimensional vectors. Privacy risks of such systems have previously been studied mostly in the context of recovering personal information, in the form of usage records, from the training data. However, the user representations themselves may be combined with external data to recover private user attributes such as gender and age. In this paper we show that user vectors computed by a common recommender system can be exploited in this way. We propose the privacy-adversarial framework to eliminate such leakage of private information, and study the trade-off between recommender performance and leakage both theoretically and empirically on a benchmark dataset. An advantage of the proposed method is that it also helps guarantee fairness of results, since all implicit knowledge of a set of attributes is scrubbed from the representations used by the model and thus cannot influence its decisions. We discuss further applications of this method toward generating deeper and more insightful recommendations.
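The abstract does not spell out the training procedure, but a common way to realize such privacy-adversarial scrubbing is to attach an adversarial classifier to the user embedding and reverse its gradients. The following is a minimal sketch under that assumption; the matrix-factorization setup, the `GradReverse` function, and all names and dimensions here are illustrative, not the paper's exact method.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Gradient reversal: identity in the forward pass, negated
    (scaled) gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class PrivacyAdversarialMF(nn.Module):
    """Matrix factorization whose user vectors feed an adversarial head
    predicting a private attribute (e.g. gender). Gradient reversal
    trains the embeddings to *remove* that attribute's signal while the
    head itself learns to detect whatever signal remains."""
    def __init__(self, n_users, n_items, dim=32, lambd=1.0):
        super().__init__()
        self.user = nn.Embedding(n_users, dim)
        self.item = nn.Embedding(n_items, dim)
        self.adv = nn.Linear(dim, 2)  # binary private attribute
        self.lambd = lambd

    def forward(self, u, i):
        u_vec = self.user(u)
        score = (u_vec * self.item(i)).sum(-1)  # recommender score
        attr_logits = self.adv(GradReverse.apply(u_vec, self.lambd))
        return score, attr_logits

# Joint objective: recommendation loss plus adversarial attribute loss.
model = PrivacyAdversarialMF(n_users=1000, n_items=500)
u = torch.randint(0, 1000, (64,))
i = torch.randint(0, 500, (64,))
ratings = torch.rand(64)
attrs = torch.randint(0, 2, (64,))  # private attribute labels
pred, logits = model(u, i)
loss = nn.functional.mse_loss(pred, ratings) \
     + nn.functional.cross_entropy(logits, attrs)
loss.backward()  # reversed gradients scrub the attribute from user vectors
```

The trade-off studied in the paper corresponds to the scaling factor (`lambd` above): larger values scrub the private attribute more aggressively at a potential cost to recommendation accuracy.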