The goal of regression and classification methods in supervised learning is to minimize the empirical risk, that is, the expectation of some loss function quantifying the prediction error under the empirical distribution. When facing scarce training data, overfitting is typically mitigated by adding regularization terms to the objective that penalize hypothesis complexity. In this paper we introduce new regularization techniques using ideas from distributionally robust optimization, and we give new probabilistic interpretations to existing techniques. Specifically, we propose to minimize the worst-case expected loss, where the worst case is taken over the ball of all (continuous or discrete) distributions that have a bounded transportation distance from the (discrete) empirical distribution. By choosing the radius of this ball judiciously, we can guarantee that the worst-case expected loss provides an upper confidence bound on the loss on test data, thus offering new generalization bounds. We prove that the resulting regularized learning problems are tractable and can be tractably kernelized for many popular loss functions. We validate our theoretical out-of-sample guarantees through simulated and empirical experiments.
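As an illustrative sketch of the formulation described above, with notation assumed here rather than drawn from the paper ($\mathcal{H}$ for the hypothesis class, $\ell$ for the loss function, $\hat{\mathbb{P}}_N$ for the empirical distribution on $N$ training samples, and $W$ for the transportation distance), the distributionally robust learning problem can be written as
\[
\min_{h \in \mathcal{H}} \; \sup_{\mathbb{Q} \,:\, W(\mathbb{Q},\, \hat{\mathbb{P}}_N) \le \epsilon} \; \mathbb{E}_{(x,y) \sim \mathbb{Q}}\big[\ell(h(x), y)\big],
\]
where the supremum ranges over the ball of all (continuous or discrete) distributions within transportation distance $\epsilon$ of $\hat{\mathbb{P}}_N$. Setting $\epsilon = 0$ recovers ordinary empirical risk minimization, while a judiciously chosen $\epsilon > 0$ yields the upper confidence bound on the test loss mentioned above.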