Optimal PAC Bounds Without Uniform Convergence

In statistical learning theory, determining the sample complexity of realizable binary classification for VC classes was a long-standing open problem. The results of Simon and Hanneke established sharp upper bounds in this setting. However, the reliance of their argument on the uniform convergence principle limits its applicability to more general learning settings such as multiclass classification. In this paper, we address this issue by providing optimal high probability risk bounds through a framework that surpasses the limitations of uniform convergence arguments. Our framework converts the leave-one-out error of permutation invariant predictors into high probability risk bounds. As an application, by adapting the one-inclusion graph algorithm of Haussler, Littlestone, and Warmuth, we propose an algorithm that achieves an optimal PAC bound for binary classification. Specifically, our result shows that certain aggregations of one-inclusion graph algorithms are optimal, addressing a variant of a classic question posed by Warmuth. We further instantiate our framework in three settings where uniform convergence is provably suboptimal. For multiclass classification, we prove an optimal risk bound that scales with the one-inclusion hypergraph density of the class, addressing the suboptimality of the analysis of Daniely and Shalev-Shwartz. For partial hypothesis classification, we determine the optimal sample complexity bound, resolving a question posed by Alon, Hanneke, Holzman, and Moran. For realizable bounded regression with absolute loss, we derive an optimal risk bound that relies on a modified version of the scale-sensitive dimension, refining the results of Bartlett and Long. Our rates surpass standard uniform convergence-based results due to the smaller complexity measure in our risk bound.
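To make the central quantity concrete, the following is a minimal sketch of the leave-one-out error of a permutation-invariant predictor. It is not the one-inclusion graph algorithm and not the paper's aggregation procedure. It assumes a toy class of one-dimensional thresholds, and the names `learn_threshold` and `leave_one_out_error` are hypothetical helpers introduced only for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, int]          # (feature x, label y in {0, 1})
Hypothesis = Callable[[float], int]  # maps a feature to a predicted label


def learn_threshold(sample: Sequence[Example]) -> Hypothesis:
    """Toy permutation-invariant learner (an illustrative assumption).

    Returns h(x) = 1 if x >= t else 0, where t is the smallest positively
    labeled point, or +infinity if the sample has no positive example.
    The output depends only on the set of examples, not on their order.
    """
    positives = [x for x, y in sample if y == 1]
    t = min(positives) if positives else math.inf
    return lambda x: 1 if x >= t else 0


def leave_one_out_error(
    learner: Callable[[Sequence[Example]], Hypothesis],
    sample: Sequence[Example],
) -> float:
    """Fraction of points misclassified when each is held out in turn."""
    mistakes = 0
    for i, (x_i, y_i) in enumerate(sample):
        # Train on every example except the i-th one.
        rest: List[Example] = list(sample[:i]) + list(sample[i + 1:])
        h = learner(rest)
        # Test the resulting hypothesis on the held-out example.
        mistakes += int(h(x_i) != y_i)
    return mistakes / len(sample)


# A sample that is realizable by the threshold t = 0.5.
sample = [(0.1, 0), (0.4, 0), (0.5, 1), (0.9, 1)]
loo = leave_one_out_error(learn_threshold, sample)
loo_reversed = leave_one_out_error(learn_threshold, list(reversed(sample)))
```

On this sample the only leave-one-out mistake occurs when the smallest positive point, 0.5, is held out, so `loo` is 1/4, and reordering the sample does not change it. For this toy learner on realizable data, at most one held-out point can ever be misclassified, and the class of thresholds has VC dimension 1. The framework described above concerns turning leave-one-out guarantees of this kind into high probability risk bounds.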
