Accurate and reliable forecasting models are critical for guiding public health responses and policy decisions during pandemics such as COVID-19. Retrospective evaluation of model performance is essential for improving epidemic forecasting capabilities. In this study, we used COVID-19 wastewater data from the CDC's National Wastewater Surveillance System to generate sequential weekly retrospective forecasts for the United States from March 2022 through September 2024, both at the national level and for four major regions (Northeast, Midwest, South, and West). We produced 133 weekly forecasts using 11 models, including ARIMA, generalized additive models (GAM), simple linear regression (SLR), Prophet, and the n-sub-epidemic framework (top-ranked, weighted-ensemble, and unweighted-ensemble variants). Forecast performance was assessed using mean absolute error (MAE), mean squared error (MSE), weighted interval score (WIS), and 95% prediction interval coverage. The n-sub-epidemic unweighted ensembles outperformed all other models at 3- to 4-week horizons, particularly at the national level and in the Midwest and West. ARIMA and GAM performed best at 1- to 2-week horizons in most regions, whereas Prophet and SLR consistently underperformed across regions and horizons. These findings highlight the value of region-specific modeling strategies and demonstrate the utility of the n-sub-epidemic framework for real-time outbreak forecasting using wastewater surveillance data.
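The four evaluation metrics named above can be illustrated with a minimal Python sketch. The abstract does not state which WIS formulation or quantile levels were used, so this assumes the commonly used definition: a weighted sum of interval scores over K central prediction intervals plus an absolute-error term for the median. All function names and the example numbers are hypothetical and are not taken from the study.

```python
def mean_absolute_error(forecasts, observed):
    """MAE: average absolute difference between point forecasts and observations."""
    return sum(abs(f - y) for f, y in zip(forecasts, observed)) / len(observed)


def mean_squared_error(forecasts, observed):
    """MSE: average squared difference between point forecasts and observations."""
    return sum((f - y) ** 2 for f, y in zip(forecasts, observed)) / len(observed)


def interval_score(lower, upper, y, alpha):
    """Interval score for a central (1 - alpha) prediction interval.

    Interval width plus a penalty, scaled by 2 / alpha, for observations
    that fall outside the interval.
    """
    width = upper - lower
    under = (2.0 / alpha) * max(lower - y, 0.0)  # y below the lower bound
    over = (2.0 / alpha) * max(y - upper, 0.0)   # y above the upper bound
    return width + under + over


def weighted_interval_score(median, intervals, y):
    """WIS for one forecast (assumed formulation, see the lead-in).

    `intervals` maps alpha -> (lower, upper) for each central
    (1 - alpha) prediction interval.
    """
    k = len(intervals)
    total = 0.5 * abs(y - median)  # median term with weight 1/2
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (k + 0.5)


def coverage(lowers, uppers, observed):
    """Fraction of observations inside their prediction intervals."""
    hits = sum(1 for l, u, y in zip(lowers, uppers, observed) if l <= y <= u)
    return hits / len(observed)


if __name__ == "__main__":
    # Hypothetical forecast for one week: median and two central intervals.
    intervals = {0.5: (90.0, 110.0), 0.05: (70.0, 130.0)}
    print(weighted_interval_score(100.0, intervals, 120.0))
    print(coverage([70.0, 60.0], [130.0, 100.0], [120.0, 105.0]))
```

Lower WIS is better: it rewards narrow intervals but penalizes intervals that miss the observation. Coverage of the 95% interval is best read alongside it, since a well-calibrated model should contain close to 95% of observations.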