A Tale of Two Long Tails

As machine learning models are increasingly employed to assist human decision-makers, it becomes critical to communicate the uncertainty associated with their predictions. However, most work on uncertainty has focused on traditional probabilistic or ranking approaches, where the model assigns low probabilities or scores to uncertain examples. While this captures which examples are challenging for the model, it does not capture the underlying source of the uncertainty. In this work, we seek to identify examples the model is uncertain about and characterize the source of that uncertainty. We explore the benefits of designing a targeted intervention: targeted data augmentation, over the course of training, of the examples where the model is uncertain. We investigate whether the rate of learning in the presence of additional information differs between atypical and noisy examples. Our results show that this is indeed the case, suggesting that well-designed interventions over the course of training can be an effective way to characterize and distinguish between different sources of uncertainty.
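The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 confidence threshold, the "atypical"/"noisy" group labels, and the toy probability values are all assumptions made for illustration. The sketch assumes we have already recorded, for each example, the probability the model assigns to the true label at each epoch after the intervention begins, and it measures "rate of learning" as the average per-epoch gain in that probability.

```python
from statistics import mean


def select_uncertain(true_class_probs, threshold=0.5):
    """Return indices of examples whose true-class probability is below threshold.

    The threshold is a hypothetical choice, not a value from the paper.
    """
    return [i for i, p in enumerate(true_class_probs) if p < threshold]


def learning_speed(prob_history):
    """Average per-epoch change in true-class probability for one example."""
    if len(prob_history) < 2:
        return 0.0
    return (prob_history[-1] - prob_history[0]) / (len(prob_history) - 1)


def mean_speed_by_group(histories, group_labels):
    """Average the learning speed of the examples within each group."""
    speeds = {}
    for history, label in zip(histories, group_labels):
        speeds.setdefault(label, []).append(learning_speed(history))
    return {label: mean(values) for label, values in speeds.items()}


if __name__ == "__main__":
    # Toy values invented for illustration only; they are not experimental results.
    histories = [
        [0.20, 0.45, 0.70],  # hypothetical atypical example
        [0.30, 0.50, 0.70],  # hypothetical atypical example
        [0.25, 0.27, 0.29],  # hypothetical noisy (mislabeled) example
        [0.90, 0.92, 0.94],  # confident example, excluded by the selection step
    ]
    group_labels = ["atypical", "atypical", "noisy", "typical"]

    # Select the examples the model is uncertain about at the first epoch.
    uncertain = select_uncertain([h[0] for h in histories])

    # Compare how quickly each uncertain group is learned.
    speeds = mean_speed_by_group(
        [histories[i] for i in uncertain],
        [group_labels[i] for i in uncertain],
    )
    print(speeds)
```

In practice the group labels would not be known in advance; a gap in learning speed between subsets of uncertain examples is the kind of signal the abstract suggests could help tell the two sources of uncertainty apart.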

