The growing reliance on Artificial Intelligence (AI) in critical domains such as healthcare demands robust mechanisms to ensure the trustworthiness of these systems, especially when faced with unexpected or anomalous inputs. This paper introduces the Open Medical Imaging Benchmarks for Out-Of-Distribution Detection (OpenMIBOOD), a comprehensive framework for evaluating out-of-distribution (OOD) detection methods specifically in medical imaging contexts. OpenMIBOOD includes three benchmarks from diverse medical domains, encompassing 14 datasets divided into covariate-shifted in-distribution, near-OOD, and far-OOD categories. We evaluate 24 post-hoc methods across these benchmarks, providing a standardized reference to advance the development and fair comparison of OOD detection methods. Results reveal that findings from broad-scale OOD benchmarks in natural image domains do not translate to medical applications, underscoring the critical need for such benchmarks in the medical field. By mitigating the risk of exposing AI models to inputs outside their training distribution, OpenMIBOOD aims to support the advancement of reliable and trustworthy AI systems in healthcare. The repository is available at https://github.com/remic-othr/OpenMIBOOD.