QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training

Clinical decision-making routinely demands reasoning over heterogeneous data, yet existing multimodal language models (MLLMs) remain largely vision-centric and fail to generalize across clinical specialties. To bridge this gap, we introduce QoQ-Med-7B/32B, the first open generalist clinical foundation model that jointly reasons across medical images, time-series signals, and text reports. QoQ-Med is trained with Domain-aware Relative Policy Optimization (DRPO), a novel reinforcement-learning objective that hierarchically scales normalized rewards according to domain rarity and modality difficulty, mitigating the performance imbalance caused by skewed clinical data distributions. Training on 2.61 million instruction-tuning pairs spanning 9 clinical domains, we show that DRPO boosts diagnostic performance by 43% in macro-F1 on average across all visual domains compared with other critic-free training methods such as GRPO. Furthermore, trained on intensive segmentation data, QoQ-Med can highlight salient regions related to the diagnosis, with an IoU 10x higher than open models while reaching the performance of OpenAI o4-mini. To foster reproducibility and downstream research, we release (i) the full model weights, (ii) the modular training pipeline, and (iii) all intermediate reasoning traces at https://github.com/DDVD233/QoQ_Med.
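The abstract describes DRPO only at a high level, so the following is a minimal sketch of the general idea and not the paper's objective. It first computes GRPO-style group-normalized advantages, then multiplies them by a per-domain factor that grows with domain rarity and difficulty. The function names, the inverse-square-root rarity term, the `0.5 + difficulty` weighting, the assumption that rewards lie in [0, 1], and the domain labels `chest_xray` and `ecg` are all illustrative assumptions.

```python
from collections import Counter
from statistics import mean, pstdev


def group_normalized_advantages(rewards, eps=1e-6):
    """GRPO-style step: standardize rewards within one group of
    responses sampled for the same prompt (no learned critic)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def domain_scale_factors(domain_labels, domain_mean_reward):
    """Hypothetical scaling rule (an assumption, not the paper's formula):
    rarer domains and domains with lower mean reward get a larger factor.
    Assumes mean rewards lie in [0, 1]."""
    counts = Counter(domain_labels)
    total = len(domain_labels)
    raw = {}
    for domain, n in counts.items():
        rarity = (total / n) ** 0.5                      # illustrative choice
        difficulty = 1.0 - domain_mean_reward[domain]    # low reward = hard
        raw[domain] = rarity * (0.5 + difficulty)
    avg = mean(list(raw.values()))
    # Rescale so the factors average to 1 across domains.
    return {domain: w / avg for domain, w in raw.items()}


def scaled_advantages(rewards, domain, factors):
    """Apply the domain factor to the group-normalized advantages."""
    return [factors[domain] * a for a in group_normalized_advantages(rewards)]


if __name__ == "__main__":
    # Hypothetical skewed batch: 8 prompts from one domain, 2 from another.
    labels = ["chest_xray"] * 8 + ["ecg"] * 2
    factors = domain_scale_factors(labels, {"chest_xray": 0.8, "ecg": 0.3})
    print(factors)
    print(scaled_advantages([1.0, 0.0, 0.0, 1.0], "ecg", factors))
```

In this sketch the under-represented, lower-reward domain receives the larger factor, so its samples contribute more strongly to the policy update. That is the imbalance-mitigation effect the abstract attributes to DRPO, though the actual method may compute its scaling quite differently.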

