Cyanobacterial Harmful Algal Blooms (CyanoHABs) pose significant threats to aquatic ecosystems and public health worldwide. Lake Champlain is particularly vulnerable to recurring CyanoHAB events, especially in its northern segments (Missisquoi Bay, St. Albans Bay, and the Northeast Arm), owing to nutrient enrichment and climatic variability. Remote sensing provides a scalable means of monitoring and forecasting these events, offering continuous coverage where in situ observations are sparse or unavailable. In this study, we present a remote-sensing-only forecasting framework that combines Transformer and BiLSTM architectures to predict CyanoHAB intensities up to 14 days in advance. The system uses Cyanobacterial Index data from the Cyanobacterial Assessment Network and temperature data from Moderate Resolution Imaging Spectroradiometer (MODIS) satellite observations to capture long-range dependencies and sequential dynamics in satellite time series. The dataset is highly sparse: more than 30% of the Cyanobacterial Index data and 90% of the temperature data are missing. A two-stage preprocessing pipeline addresses these gaps by applying forward fill and weighted temporal imputation at the pixel level, followed by smoothing to reduce discontinuities in CyanoHAB events. The raw dataset is then transformed into features through equal-frequency binning of the Cyanobacterial Index values and extraction of temperature statistics. The Transformer-BiLSTM model demonstrates strong forecasting performance across multiple horizons, achieving F1 scores of 89.5%, 86.4%, and 85.5% for one-, two-, and three-day forecasts, respectively, and maintaining an F1 score of 78.9% with an AUC of 82.6% at the 14-day horizon. These results indicate that the model can capture complex spatiotemporal dynamics from sparse satellite data and provide reliable early warning for CyanoHAB management.
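The gap-filling and binning steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the weighting scheme, the fill limits, the smoothing filter, or the number of bins, so the function names (`fill_and_smooth`, `equal_frequency_bins`), the parameters `ffill_limit`, `window`, and `n_bins`, the use of time-weighted linear interpolation as the "weighted temporal imputation", and the centered rolling mean are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd


def fill_and_smooth(series: pd.Series, ffill_limit: int = 2, window: int = 3) -> pd.Series:
    """Gap-fill one pixel's time series, then smooth it.

    `series` must have a DatetimeIndex. All defaults are illustrative
    assumptions, not values taken from the study.
    """
    # Forward fill: carry the last valid observation across short gaps only.
    filled = series.ffill(limit=ffill_limit)
    # Assumed stand-in for "weighted temporal imputation": linear interpolation
    # weighted by the time distance to the neighbouring valid observations.
    # limit_area="inside" leaves leading/trailing gaps untouched.
    filled = filled.interpolate(method="time", limit_area="inside")
    # Smoothing: centered rolling mean (assumed; the filter type is not stated).
    return filled.rolling(window, center=True, min_periods=1).mean()


def equal_frequency_bins(values: pd.Series, n_bins: int = 4) -> pd.Series:
    """Assign each value to one of `n_bins` quantile-based classes.

    Equal-frequency binning puts roughly the same number of samples in
    each class, which helps when index values are heavily skewed.
    """
    return pd.qcut(values, q=n_bins, labels=False, duplicates="drop")


if __name__ == "__main__":
    # Synthetic daily series for a single hypothetical pixel.
    dates = pd.date_range("2020-06-01", periods=10, freq="D")
    ci = pd.Series(
        [1.0, np.nan, np.nan, np.nan, np.nan, 6.0, np.nan, 8.0, np.nan, np.nan],
        index=dates,
    )
    cleaned = fill_and_smooth(ci)
    print(cleaned)
    print(equal_frequency_bins(cleaned))
```

Quantile-based binning is used here because it keeps the intensity classes balanced even when most pixels have low index values; a fixed-width binning would leave the high-intensity classes nearly empty.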