A growing body of work on assessing fairness in machine learning has shown that dimensionality reduction methods such as PCA can treat data from different sensitive groups unfairly. In particular, when data from different groups are aggregated, the reconstruction error of the learned subspace becomes biased, inherently benefiting some populations while hurting others and yielding an unfair representation. On the other hand, mitigating this bias to protect sensitive groups when learning the optimal projection increases the overall reconstruction error. This introduces a trade-off between the sacrifices and benefits of individual sensitive groups and the overall reconstruction error. In this paper, in pursuit of fairness criteria for PCA, we introduce a more efficient notion of Pareto fairness, cast Pareto fair dimensionality reduction as a multi-objective optimization problem, and propose an adaptive gradient-based algorithm to solve it. The notion of Pareto optimality guarantees that the solution of our algorithm lies on the Pareto frontier over all groups, achieving the optimal trade-off between the aforementioned conflicting objectives. The framework also generalizes efficiently to sensitive features with multiple groups. We provide a convergence analysis of our algorithm for both convex and non-convex objectives and demonstrate its efficacy through empirical studies on several datasets, in comparison with state-of-the-art algorithms.
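To make the multi-objective formulation concrete, the following is a minimal sketch, not the paper's actual algorithm: each group contributes its own PCA reconstruction-error objective, and one common descent direction is computed as the minimum-norm convex combination of the per-group gradients (in the style of MGDA-type multi-objective gradient descent), followed by QR re-orthonormalization of the projection basis. All function names and the two-group restriction are illustrative assumptions.

```python
import numpy as np

def group_recon_error(X, U):
    # Mean squared PCA reconstruction error of group data X (n x d)
    # under an orthonormal basis U (d x k).
    R = X - X @ U @ U.T
    return np.sum(R ** 2) / X.shape[0]

def recon_grad(X, U):
    # Euclidean gradient of the error above w.r.t. U (the orthonormality
    # constraint is restored afterwards by QR re-orthonormalization).
    C = X.T @ X / X.shape[0]
    return -2.0 * (C @ U)

def pareto_step(X1, X2, U, lr=0.01):
    # One illustrative step: the minimum-norm point in the convex hull of
    # the two group gradients is a common descent direction (it decreases
    # both objectives until a Pareto-stationary point is reached).
    g1 = recon_grad(X1, U).ravel()
    g2 = recon_grad(X2, U).ravel()
    diff = g1 - g2
    denom = diff @ diff
    gamma = 0.5 if denom < 1e-12 else np.clip((g2 - g1) @ g2 / denom, 0.0, 1.0)
    d = (gamma * g1 + (1.0 - gamma) * g2).reshape(U.shape)
    Q, _ = np.linalg.qr(U - lr * d)  # retract back onto orthonormal bases
    return Q
```

This is only a two-group sketch; with more groups the minimum-norm combination no longer has a closed form and requires solving a small quadratic program over the simplex, while the generalization to multiple sensitive groups described in the abstract follows the paper's own adaptive scheme.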