Beyond assigning the correct class, an activity recognition model should also be able to determine how certain it is in its predictions. We present the first study of how well the confidence values of modern action recognition architectures reflect the probability of the correct outcome, and we propose a learning-based approach for improving them. First, we extend two popular action recognition datasets with a reliability benchmark in the form of the expected calibration error and reliability diagrams. Since our evaluation shows that the confidence values of standard action recognition architectures do not represent the uncertainty well, we introduce a new approach that learns to transform the model output into realistic confidence estimates through an additional calibration network. The main idea of our Calibrated Action Recognition with Input Guidance (CARING) model is to learn an optimal scaling parameter that depends on the video representation. We compare our model with the native action recognition networks and with temperature scaling, a widespread calibration method used in image classification. While temperature scaling alone drastically improves the reliability of the confidence values, our CARING method consistently leads to the best uncertainty estimates in all benchmark settings.
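To make the two quantities named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the usual binned definition of the expected calibration error (the sample-weighted mean gap between accuracy and confidence per bin) and the usual form of temperature scaling (dividing the logits by a temperature before the softmax). The function names, the default of 10 bins, and the example values are illustrative assumptions. Passing one temperature per sample only mimics the idea of an input-dependent scaling parameter; the calibration network that would predict it is not shown.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Softmax over the last axis after dividing the logits by a temperature.

    `temperature` may be a scalar (classic temperature scaling) or an array of
    shape (N, 1) holding one value per sample (an assumed stand-in for an
    input-dependent scaling parameter).
    """
    scaled = np.asarray(logits, dtype=float) / np.asarray(temperature, dtype=float)
    scaled = scaled - scaled.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE: sum over bins of (bin fraction) * |accuracy - mean confidence|."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidences = probs.max(axis=1)        # confidence of the predicted class
    predictions = probs.argmax(axis=1)
    correct = (predictions == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lower, upper in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lower) & (confidences <= upper)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap     # in_bin.mean() is the bin's sample share
    return ece


if __name__ == "__main__":
    # Hypothetical logits and labels, chosen only to exercise the functions.
    logits = np.array([[4.0, 0.0, 0.0], [3.0, 2.5, 0.0], [0.0, 5.0, 1.0]])
    labels = np.array([0, 1, 1])
    for t in (1.0, 2.0):
        print(t, expected_calibration_error(softmax(logits, t), labels))
```

A temperature above 1 flattens the softmax output and lowers the confidence of the predicted class without changing which class is predicted, which is why it can reduce overconfidence while leaving accuracy untouched.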