Conventional Learning-to-Rank (LTR) methods optimize the utility of rankings to users, but they are oblivious to the impact of those rankings on the ranked items. There is a growing understanding, however, that this impact is important to consider in a wide range of ranking applications (e.g., online marketplaces, job placement, admissions). To address this need, we propose a general LTR framework that can optimize a wide range of utility metrics (e.g., NDCG) while satisfying fairness-of-exposure constraints with respect to the items. The framework expands the class of learnable ranking functions to stochastic ranking policies, which provides a language for rigorously expressing fairness specifications. Furthermore, we provide a new LTR algorithm, Fair-PG-Rank, that directly searches the space of fair ranking policies via a policy-gradient approach. Beyond the theoretical evidence from deriving the framework and the algorithm, we provide empirical results on simulated and real-world datasets that verify the effectiveness of the approach in both individual- and group-fairness settings.
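To make the idea of a policy-gradient search over stochastic ranking policies concrete, the following is a minimal sketch and not the Fair-PG-Rank implementation. It assumes a Plackett-Luce sampling policy over per-item scores, DCG as the utility, a position-based exposure of 1/log2(1 + rank), and a REINFORCE-style gradient estimate with a mean baseline; none of these choices is specified in the text above, and all function names are hypothetical. The fairness term is omitted from the gradient: `expected_exposure` only shows the quantity a fairness-of-exposure constraint would be stated over.

```python
import math
import random


def sample_ranking(scores, rng):
    """Sample a ranking (list of item indices) from a Plackett-Luce policy (assumed)."""
    remaining = list(range(len(scores)))
    ranking = []
    while remaining:
        m = max(scores[i] for i in remaining)  # subtract max for numerical stability
        weights = [math.exp(scores[i] - m) for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        ranking.append(pick)
        remaining.remove(pick)
    return ranking


def grad_log_prob(scores, ranking):
    """Gradient of log P(ranking | scores) with respect to each item score."""
    grad = [0.0] * len(scores)
    remaining = list(ranking)
    for item in ranking:
        m = max(scores[i] for i in remaining)
        z = sum(math.exp(scores[i] - m) for i in remaining)
        for i in remaining:
            grad[i] -= math.exp(scores[i] - m) / z  # minus softmax over remaining items
        grad[item] += 1.0  # plus one for the item actually placed
        remaining.remove(item)
    return grad


def dcg(ranking, relevance):
    """Discounted cumulative gain; position 0 is the top of the ranking."""
    return sum(relevance[item] / math.log2(2 + pos) for pos, item in enumerate(ranking))


def expected_exposure(scores, n_samples=2000, seed=0):
    """Monte Carlo estimate of each item's expected exposure under the policy."""
    rng = random.Random(seed)
    exposure = [0.0] * len(scores)
    for _ in range(n_samples):
        for pos, item in enumerate(sample_ranking(scores, rng)):
            exposure[item] += 1.0 / math.log2(2 + pos) / n_samples
    return exposure


def policy_gradient(scores, relevance, n_samples=2000, seed=0):
    """REINFORCE estimate of d E[DCG] / d scores, using a mean baseline."""
    rng = random.Random(seed)
    samples = [sample_ranking(scores, rng) for _ in range(n_samples)]
    utils = [dcg(r, relevance) for r in samples]
    baseline = sum(utils) / n_samples
    grad = [0.0] * len(scores)
    for r, u in zip(samples, utils):
        g = grad_log_prob(scores, r)
        for i in range(len(scores)):
            grad[i] += (u - baseline) * g[i] / n_samples
    return grad
```

Calling `policy_gradient([0.0, 0.0, 0.0], [1.0, 0.0, 0.0])` gives a gradient that raises the score of the only relevant item and lowers the others. Under the same assumptions, a fairness-aware variant would subtract a penalty on exposure disparity from the sampled utility before forming the estimate.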