Existing Clinical Decision Support Systems (CDSSs) largely depend on the availability of structured patient data and Electronic Health Records (EHRs) to aid caregivers. However, structured patient data formats are not widely adopted in hospitals in developing countries, where medical professionals still rely on clinical notes written as unstructured text. Such unstructured clinical notes are a potentially rich source of patient-specific information that can be leveraged to build CDSSs, even for hospitals in developing countries. If this unstructured clinical text can be used directly, the manual and time-consuming process of EHR generation is no longer required, saving substantial person-hours and cost. In this paper, we propose a generic ICD9 disease group prediction CDSS built on unstructured physician notes modeled using hybrid word embeddings. These word embeddings are used to train a deep neural network that predicts ICD9 disease groups. Experimental evaluation showed that the proposed approach outperformed the state-of-the-art disease group prediction model built on structured EHRs by 15% in terms of AUROC and 40% in terms of AUPRC, thus supporting our hypothesis and eliminating the dependency on the availability of structured patient data.
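The pipeline described above (note text, then word embeddings, then a neural predictor over disease groups) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy vectors in `EMBEDDINGS`, the untrained parameters in `WEIGHTS`, the two group names, the averaging of word vectors, and the single sigmoid layer that stands in for the deep network. It is not the hybrid embedding or the architecture used in the paper.

```python
import math

# Toy 3-dimensional word vectors (hypothetical values, for illustration only).
EMBEDDINGS = {
    "chest": [0.9, 0.1, 0.0],
    "pain": [0.8, 0.2, 0.1],
    "cough": [0.1, 0.9, 0.0],
    "fever": [0.2, 0.7, 0.3],
}

# One hypothetical (weights, bias) pair per disease group; untrained.
WEIGHTS = {
    "circulatory": ([4.0, -2.0, 0.0], -1.0),
    "respiratory": ([-2.0, 4.0, 0.0], -1.0),
}


def embed_note(note):
    """Represent a note as the average of its known word vectors."""
    vectors = [EMBEDDINGS[t] for t in note.lower().split() if t in EMBEDDINGS]
    if not vectors:
        # No known tokens: fall back to a zero vector.
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_groups(note):
    """Return an independent probability for each disease group."""
    x = embed_note(note)
    return {
        group: sigmoid(sum(w * xi for w, xi in zip(ws, x)) + bias)
        for group, (ws, bias) in WEIGHTS.items()
    }


if __name__ == "__main__":
    print(predict_groups("Chest pain on exertion"))
```

Scoring each group with its own sigmoid, rather than a softmax over all groups, lets one note be assigned to several disease groups at once. That multi-label reading is itself an assumption of this sketch.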