We study fairness from a statistical perspective, selectively using either conditional distance covariance or distance covariance statistics as measures of the (conditional) independence between predictions and sensitive attributes. We promote fairness through independence by adding a distance covariance-based penalty to the model's training objective. Additionally, we derive the matrix form of the empirical (conditional) distance covariance, enabling parallel computation and improving efficiency. Theoretically, we prove that the empirical (conditional) distance covariance converges to its population counterpart, establishing the guarantees needed for batch computation. Experiments on a range of real-world datasets demonstrate that our method effectively narrows the fairness gap in machine learning. Our code is available at \url{https://github.com/liuhaixias1/Fair_dc/}.
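The matrix form referred to above can be illustrated with a minimal NumPy sketch of the standard biased empirical distance covariance estimator (double-centered pairwise distance matrices); the function name `empirical_dcov` is illustrative, and the paper's actual parallel, penalty-based implementation may differ:

```python
import numpy as np

def empirical_dcov(x, y):
    """Biased empirical distance covariance in matrix form:
    double-center the pairwise distance matrices, then take the
    mean of their elementwise product (vectorized, no Python loops)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    # Pairwise Euclidean distance matrices
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)
    # Double-centering: subtract row and column means, add the grand mean
    A = a - a.mean(0, keepdims=True) - a.mean(1, keepdims=True) + a.mean()
    B = b - b.mean(0, keepdims=True) - b.mean(1, keepdims=True) + b.mean()
    # dCov_n^2(X, Y) = mean of the elementwise product A * B
    return np.sqrt(max((A * B).mean(), 0.0))
```

Because the estimator reduces to dense matrix operations, it batches naturally on accelerators, which is what makes it usable as a differentiable training penalty.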