Self Pre-training with Masked Autoencoders for Medical Image Classification and Segmentation

Masked Autoencoder (MAE) has recently been shown to be effective in pre-training Vision Transformers (ViT) for natural image analysis. By reconstructing full images from partially masked inputs, a ViT encoder aggregates contextual information to infer masked image regions. We believe that this context aggregation ability is particularly essential to the medical image domain where each anatomical structure is functionally and mechanically connected to other structures and regions. Because there is no ImageNet-scale medical image dataset for pre-training, we investigate a self pre-training paradigm with MAE for medical image analysis tasks. Our method pre-trains a ViT on the training set of the target data instead of another dataset. Thus, self pre-training can benefit more scenarios where pre-training data is hard to acquire. Our experimental results show that MAE self pre-training markedly improves diverse medical image tasks including chest X-ray disease classification, abdominal CT multi-organ segmentation, and MRI brain tumor segmentation. Code is available at https://github.com/cvlab-stonybrook/SelfMedMAE
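The masking-and-reconstruction idea in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the SelfMedMAE repository: the function names (`patchify`, `random_masking`, `masked_mse`), the 16-pixel patch size, the 75% mask ratio, and the single-channel 224x224 input are all assumptions chosen for illustration. The sketch shows only the data flow (split the image into patches, hide a random subset, score the reconstruction on the hidden patches); the ViT encoder and decoder themselves are omitted.

```python
import numpy as np


def patchify(image, patch_size):
    """Split an (H, W) image into non-overlapping, flattened patches."""
    h, w = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    patches = image.reshape(gh, patch_size, gw, patch_size)
    patches = patches.transpose(0, 2, 1, 3)  # (gh, gw, patch, patch)
    return patches.reshape(gh * gw, patch_size * patch_size)


def random_masking(patches, mask_ratio, rng):
    """Keep a random subset of patches; the rest are hidden from the encoder.

    Returns the visible patches, a boolean mask (True = masked),
    and the sorted indices of the visible patches.
    """
    n = patches.shape[0]
    n_keep = int(n * (1 - mask_ratio))
    keep_ids = np.sort(rng.permutation(n)[:n_keep])
    mask = np.ones(n, dtype=bool)
    mask[keep_ids] = False
    return patches[keep_ids], mask, keep_ids


def masked_mse(pred, target, mask):
    """Mean squared error averaged over masked patches only."""
    per_patch = ((pred - target) ** 2).mean(axis=1)
    return per_patch[mask].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((224, 224))       # stand-in for one image slice
    patches = patchify(image, 16)        # shape (196, 256)
    visible, mask, keep_ids = random_masking(patches, 0.75, rng)
    # The encoder would see only `visible` (49 patches); a decoder would
    # then predict all 196 patches. A zero prediction stands in for it here.
    pred = np.zeros_like(patches)
    print(visible.shape, int(mask.sum()), masked_mse(pred, patches, mask))
```

In self pre-training as the abstract describes it, the images fed to this step would come from the training set of the target task itself rather than from a separate pre-training dataset.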
