Dengue is a major threat to public health in Brazil, the world's sixth most populous country, with over 1.5 million cases recorded in 2019 alone. Official data on dengue case counts is delivered incrementally and, for many reasons, is often subject to delays of weeks. In contrast, data on dengue-related Google searches and Twitter messages is available in full with no delay. Here, we describe a model that uses online data to deliver improved weekly estimates of dengue incidence in Rio de Janeiro. We address a key shortcoming of previous disease surveillance models based on online data by explicitly accounting for the incremental delivery of case count data, ensuring that our approach can be used in practice. We also draw on data from Google Trends and Twitter in tandem, and demonstrate that this leads to slightly better estimates than a model using only one of these data streams. Our results provide evidence that online data can be used to improve both the accuracy and precision of rapid estimates of disease incidence, even where the underlying case count data is subject to long and varied delays.