Patients increasingly turn to search engines and online content before, or in place of, talking with a health professional. Low-quality health information, which is common on the internet, exposes patients to misinformation and may weaken their relationship with their physician. To address this, the DISCERN criteria (developed at the University of Oxford) are used to evaluate the quality of online health information. However, patients are unlikely to take the time to apply these criteria to the health websites they visit. We built an automated implementation of the DISCERN instrument (Brief version) using machine learning models. We compared the performance of a traditional model (Random Forest) with that of a hierarchical encoder attention-based neural network (HEA) model using two language embeddings, BERT and BioBERT. The HEA BERT and BioBERT models achieved average F1-macro scores across all criteria of 0.75 and 0.74, respectively, outperforming the Random Forest model (average F1-macro = 0.69). Similarly, as measured by F1-micro, HEA BERT and BioBERT scored on average 0.80 and 0.81, respectively, vs. 0.76 for the Random Forest model. Overall, the neural-network-based models achieved 81% and 86% average accuracy at 100% and 80% coverage, respectively, compared to 94% manual rating accuracy. The attention mechanism implemented in the HEA architectures provided 'model explainability' by identifying reasonable supporting sentences for the documents fulfilling the Brief DISCERN criteria. Our research suggests that it is feasible to automate online health information quality assessment, which is an important step towards empowering patients to become informed partners in the healthcare process.
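To make the reported metrics concrete, the following is a minimal sketch of how F1-macro, F1-micro, and accuracy at a given coverage could be computed. It is not the authors' evaluation code. The function names, the toy labels, and the confidence values are all hypothetical, and the sketch assumes that "coverage" means the fraction of documents kept after discarding the lowest-confidence predictions.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 scores, so every class counts equally.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts over all classes, then compute a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


def accuracy_at_coverage(y_true, y_pred, confidence, coverage):
    """Accuracy on the `coverage` fraction of most confident predictions.

    Assumption: lower-confidence predictions are the ones abstained on.
    """
    n_keep = max(1, round(coverage * len(y_true)))
    ranked = sorted(range(len(y_true)), key=lambda i: confidence[i], reverse=True)
    kept = ranked[:n_keep]
    return sum(y_true[i] == y_pred[i] for i in kept) / n_keep


# Hypothetical toy data: 1 = criterion fulfilled, 0 = not fulfilled.
y_true = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0, 0, 1]
confidence = [0.9, 0.95, 0.8, 0.85, 0.7, 0.75, 0.4, 0.9, 0.6, 0.45]

macro_f1, micro_f1 = f1_scores(y_true, y_pred)
print(f"F1-macro={macro_f1:.3f}  F1-micro={micro_f1:.3f}")
print("accuracy @100% coverage:", accuracy_at_coverage(y_true, y_pred, confidence, 1.0))
print("accuracy @80% coverage: ", accuracy_at_coverage(y_true, y_pred, confidence, 0.8))
```

In this toy example the two misclassified documents also have the lowest confidence, so dropping them raises accuracy. This illustrates the kind of trade-off between coverage and accuracy that the abstract reports.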