Medical dialogue systems hold promise for telemedicine: they can increase access to healthcare services, improve the quality of patient care, and reduce medical costs. To facilitate the research and development of such systems, we build two large-scale medical dialogue datasets: MedDialog-EN and MedDialog-CN. MedDialog-EN is an English dataset containing 0.3 million patient-doctor conversations comprising 0.5 million utterances. MedDialog-CN is a Chinese dataset containing 1.1 million conversations and 4 million utterances. To the best of our knowledge, MedDialog-(EN,CN) are the largest medical dialogue datasets to date. The datasets are available at https://github.com/UCSD-AI4H/Medical-Dialogue-System