For dementia screening and monitoring, standardized tests play a key role in clinical routine, since they aim to minimize subjectivity by measuring performance on a variety of cognitive tasks. In this paper, we report on a study that consists of a semi-standardized history taking followed by two standardized neuropsychological tests, namely the SKT and the CERAD-NB. The tests include basic tasks such as naming objects and learning word lists, as well as widely used tools such as the MMSE. Most of the tasks are performed verbally and should thus be suitable for automated scoring based on transcripts. For the first batch of 30 patients, we analyze the correlation between expert manual evaluations and automatic evaluations based on manual and automatic transcriptions. For both SKT and CERAD-NB, we observe high to perfect correlations using manual transcripts; for certain tasks with lower correlation, the automatic scoring is stricter than the human reference since it is limited to the audio. Using automatic transcriptions, correlations drop as expected, and the drop is related to recognition accuracy; however, we still observe high correlations of up to 0.98 (SKT) and 0.85 (CERAD-NB). We show that using word alternatives helps to mitigate recognition errors and subsequently improves correlation with expert scores.
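The idea of transcript-based scoring with word alternatives can be illustrated with a minimal sketch. It is not the scoring procedure used in the study: the function names `score_naming` and `pearson`, the slot-based transcript layout, the choice of the Pearson coefficient, and the toy words are all assumptions made for illustration only.

```python
from math import sqrt


def score_naming(targets, transcript):
    """Count how many target words occur in a transcript.

    `transcript` is a list of word slots; each slot is a list of
    alternatives with the recognizer's 1-best hypothesis first.
    A manual transcript simply has one word per slot.
    (Hypothetical layout, assumed for this sketch.)
    """
    spoken = {alt.lower() for slot in transcript for alt in slot}
    return sum(1 for target in targets if target.lower() in spoken)


def pearson(xs, ys):
    """Pearson correlation between two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy data, invented for illustration (not taken from the study).
targets = ["house", "tree", "dog"]
asr_slots = [["mouse", "house"], ["tree"], ["dock", "dog"]]
one_best = [[slot[0]] for slot in asr_slots]

score_one_best = score_naming(targets, one_best)    # only "tree" matches
score_with_alts = score_naming(targets, asr_slots)  # alternatives recover all three
```

In this toy case, two of the three target words are misrecognized in the 1-best hypothesis but appear among the alternatives, so the score that considers alternatives moves closer to what a human rater listening to the audio would assign. Per-patient scores obtained this way could then be compared with expert scores via `pearson`.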