Deep Neural Networks (DNNs) have demonstrated remarkable performance across various domains, including computer vision and natural language processing. However, they often struggle to accurately quantify the uncertainty of their predictions, which limits their broader adoption in safety-critical real-world applications. Uncertainty Quantification (UQ) for Deep Learning seeks to address this challenge by providing methods that improve the reliability of uncertainty estimates. Although numerous techniques have been proposed, a unified tool offering a seamless workflow to evaluate and integrate these methods remains lacking. To bridge this gap, we introduce Torch-Uncertainty, a PyTorch- and Lightning-based framework designed to streamline the training and evaluation of DNNs with UQ techniques and metrics. In this paper, we outline the foundational principles of our library and present comprehensive experimental results benchmarking a diverse set of UQ methods across classification, segmentation, and regression tasks. Our library is available at https://github.com/ENSTA-U2IS-AI/Torch-Uncertainty.