Deep learning holds immense promise for transforming medical image analysis, yet its clinical generalization remains profoundly limited. A major barrier is data heterogeneity, particularly in Magnetic Resonance Imaging (MRI), where scanner hardware differences, diverse acquisition protocols, and varying sequence parameters introduce substantial domain shifts that obscure underlying biological signals. Data harmonization methods aim to reduce this instrumental and acquisition variability, but existing approaches remain insufficient. Image-based harmonization approaches are often restricted by the need for target images, while existing text-guided methods either rely on simplistic labels that fail to capture complex acquisition details or are restricted to datasets with limited variability, and thus do not reflect the heterogeneity of real-world clinical environments. To address these limitations, we propose DIST-CLIP (Disentangled Style Transfer with CLIP Guidance), a unified framework for MRI harmonization that flexibly uses either target images or DICOM metadata for guidance. Our framework explicitly disentangles anatomical content from image contrast, extracting the contrast representations with pre-trained CLIP encoders. These contrast embeddings are then integrated into the anatomical content via a novel Adaptive Style Transfer module. We trained and evaluated DIST-CLIP on diverse real-world clinical datasets and showed significant improvements over state-of-the-art methods in both style translation fidelity and anatomical preservation, offering a flexible solution for style transfer and MRI data standardization. Our code and weights will be made publicly available upon publication.
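The abstract does not specify how the Adaptive Style Transfer module combines contrast embeddings with anatomical content. As a rough illustration only, the sketch below shows one common way such conditioning can work: AdaIN-style modulation, in which a style vector is projected to per-channel scale and shift values that are applied to an instance-normalized content feature map. Everything here is an assumption made for illustration: the function names `instance_norm` and `modulate`, the projection matrices `w_scale` and `w_shift`, and the random vector standing in for a CLIP embedding are hypothetical and are not taken from DIST-CLIP.

```python
import numpy as np


def instance_norm(content, eps=1e-5):
    """Normalize each channel of a (C, H, W) feature map to zero mean, unit variance."""
    mean = content.mean(axis=(1, 2), keepdims=True)
    std = content.std(axis=(1, 2), keepdims=True)
    return (content - mean) / (std + eps)


def modulate(content, style_embedding, w_scale, w_shift):
    """AdaIN-style modulation (an assumed stand-in, not the paper's module).

    content:         (C, H, W) anatomical feature map
    style_embedding: (D,) contrast embedding, e.g. from an image or text encoder
    w_scale, w_shift: (C, D) hypothetical linear projections
    """
    scale = w_scale @ style_embedding  # per-channel scale, shape (C,)
    shift = w_shift @ style_embedding  # per-channel shift, shape (C,)
    normalized = instance_norm(content)
    return scale[:, None, None] * normalized + shift[:, None, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    C, H, W, D = 4, 8, 8, 16
    content = rng.normal(size=(C, H, W))   # placeholder content features
    style_embedding = rng.normal(size=D)   # placeholder for a CLIP embedding
    w_scale = rng.normal(size=(C, D))
    w_shift = rng.normal(size=(C, D))
    stylized = modulate(content, style_embedding, w_scale, w_shift)
    print(stylized.shape)
```

Because the same `style_embedding` argument could come from either an image encoder or a text encoder, a design of this kind would be compatible with guidance from target images or from metadata, which is the flexibility the abstract describes.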