Large Language Models (LLMs) have rapidly evolved from text-based systems to multimodal platforms, significantly impacting various sectors including healthcare. This comprehensive review explores the progression of LLMs to Multimodal Large Language Models (MLLMs) and their growing influence in medical practice. We examine the current landscape of MLLMs in healthcare, analyzing their applications across clinical decision support, medical imaging, patient engagement, and research. The review highlights the unique capabilities of MLLMs in integrating diverse data types, such as text, images, and audio, to provide more comprehensive insights into patient health. We also address the challenges facing MLLM implementation, including data limitations, technical hurdles, and ethical considerations. By identifying key research gaps, this paper aims to guide future investigations in areas such as dataset development, modality alignment methods, and the establishment of ethical guidelines. As MLLMs continue to shape the future of healthcare, understanding their potential and limitations is crucial for their responsible and effective integration into medical practice.