A Generalized Deep Learning Framework for Whole-Slide Image Segmentation and Analysis

Histopathology tissue analysis is considered the gold standard in cancer diagnosis and prognosis. Given the large size of whole-slide images and the growing number of potential cancer cases, an automated solution to aid histopathologists is highly desirable. In recent years, deep learning-based techniques have produced state-of-the-art results in a wide variety of image analysis tasks, including the analysis of digitized slides. However, the size of the images and the variability of histopathology tasks make it challenging to develop an integrated framework for histopathology image analysis. We propose a deep learning-based framework for histopathology tissue analysis. We demonstrate the generalizability of our framework, including training and inference, on several open-source datasets: CAMELYON (breast cancer metastases), DigestPath (colon cancer), and PAIP (liver cancer). We discuss the types of uncertainty pertaining to the data and the model, namely aleatoric and epistemic, respectively. In addition, we demonstrate the generalization of our model across data distributions by evaluating some samples from TCGA data. On the CAMELYON16 test data (n=139), for the task of lesion detection, the FROC score achieved was 0.86, and on the CAMELYON17 test data (n=500), for the task of pN-staging, the Cohen's kappa score achieved was 0.9090 (third on the open leaderboard). On the DigestPath test data (n=212), for the task of tumor segmentation, a Dice score of 0.782 was achieved (fourth in the challenge). On the PAIP test data (n=40), for the task of viable tumor segmentation, a Jaccard Index of 0.75 was achieved (third in the challenge), and for viable tumor burden, a score of 0.633 was achieved (second in the challenge). Our entire framework and related documentation are freely available on GitHub and PyPI.
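
For readers unfamiliar with the two segmentation metrics reported above, the following is a minimal sketch of the textbook Dice score and Jaccard Index on binary masks. It is not taken from the framework itself: the function names `dice_score` and `jaccard_index`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, and the individual challenges may compute their official scores with additional rules.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # hypothetical representation: rows of 0/1 pixels


def _overlap_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted positives, true positives in truth)."""
    inter = pred_sum = truth_sum = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            p, t = bool(p), bool(t)
            inter += int(p and t)
            pred_sum += int(p)
            truth_sum += int(t)
    return inter, pred_sum, truth_sum


def dice_score(pred: Mask, truth: Mask) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|); assumed to be 1.0 when both masks are empty."""
    inter, pred_sum, truth_sum = _overlap_counts(pred, truth)
    total = pred_sum + truth_sum
    return 1.0 if total == 0 else 2.0 * inter / total


def jaccard_index(pred: Mask, truth: Mask) -> float:
    """Jaccard = |A ∩ B| / |A ∪ B|; assumed to be 1.0 when both masks are empty."""
    inter, pred_sum, truth_sum = _overlap_counts(pred, truth)
    union = pred_sum + truth_sum - inter
    return 1.0 if union == 0 else inter / union


if __name__ == "__main__":
    # Toy 2x3 masks, not data from any of the datasets named above.
    pred = [[1, 1, 0], [0, 1, 0]]
    truth = [[1, 0, 0], [0, 1, 1]]
    print(dice_score(pred, truth), jaccard_index(pred, truth))
```

For a single pair of masks the two metrics are related by J = D / (2 - D), so they rank predictions identically. The Dice and Jaccard figures quoted in the abstract come from different datasets and should not be converted into one another.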
