Given a prediction task, understanding when one can and cannot design a consistent convex surrogate loss, particularly a low-dimensional one, is an important and active area of machine learning research. The prediction task may be given as a target loss, as in classification and structured prediction, or simply as a (conditional) statistic of the data, as in risk measure estimation. These two scenarios typically require different techniques for designing and analyzing surrogate losses. We unify these settings using tools from property elicitation, and give a general lower bound on prediction dimension. Our lower bound tightens existing results in the case of discrete predictions, showing that previous calibration-based bounds can largely be recovered via property elicitation. For continuous estimation, our lower bound resolves an open problem on estimating measures of risk and uncertainty.