The balanced loss is a widely adopted objective for multi-class classification under class imbalance. By assigning equal importance to every class, regardless of its frequency, it promotes fairness and ensures that minority classes are not overlooked. However, directly minimizing the balanced classification loss is typically intractable, which makes the design of effective surrogate losses a central question. This paper introduces and studies two surrogate loss families: Generalized Logit-Adjusted (GLA) losses and Generalized Class-Aware weighted (GCA) losses. GLA losses generalize Logit-Adjusted (LA) losses, which shift logits by class priors, to the broader family of general cross-entropy losses. GCA losses extend standard class-weighted losses, which scale each example's loss inversely with its class frequency, by incorporating class-dependent confidence margins and generalizing them to the same cross-entropy family. We present a comprehensive theoretical analysis of consistency for both families. We show that GLA losses are Bayes-consistent but $H$-consistent only for complete (i.e., unbounded) hypothesis sets. Moreover, their $H$-consistency bounds depend inversely on the minimum class probability $\mathsf p_{\min}$, scaling at least as $1/\mathsf p_{\min}$. In contrast, GCA losses are $H$-consistent for any hypothesis set that is bounded or complete, with $H$-consistency bounds that scale more favorably as $1/\sqrt{\mathsf p_{\min}}$, offering significantly stronger theoretical guarantees in imbalanced settings. We report experimental results showing that both GCA losses with calibrated class-dependent confidence margins and GLA losses can greatly outperform straightforward class-weighted losses as well as LA losses. GLA losses generally perform somewhat better on common benchmarks, whereas GCA losses hold the edge in highly imbalanced settings.
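For concreteness, the two baseline constructions that these families generalize can be sketched as follows; the notation here ($f_y(x)$ for the score assigned to class $y$, $\pi_y$ for its empirical prior, $\tau > 0$ a temperature parameter) is assumed for illustration and is not fixed by the abstract above. The LA loss shifts logits by scaled log-priors before applying the softmax cross-entropy, while the class-weighted loss rescales the per-example cross-entropy inversely with class frequency:
\[
% Illustrative baselines only; the notation (f, \pi_y, \tau) is assumed,
% and the GLA/GCA families studied in this paper generalize these forms.
\ell_{\mathrm{LA}}(f, x, y) = -\log \frac{e^{f_y(x) + \tau \log \pi_y}}{\sum_{y'} e^{f_{y'}(x) + \tau \log \pi_{y'}}},
\qquad
\ell_{\mathrm{CW}}(f, x, y) = \frac{1}{\pi_y}\biggl[-\log \frac{e^{f_y(x)}}{\sum_{y'} e^{f_{y'}(x)}}\biggr].
\]
GLA losses replace the log-softmax above with a member of the general cross-entropy family applied to the shifted logits; GCA losses augment the weighted form with class-dependent confidence margins, as described above.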