Hyperspectral imaging (HSI) analysis faces computational bottlenecks due to massive data volumes that exceed available memory. While foundation models pre-trained on large remote sensing datasets show promise, their learned representations often fail to transfer to domain-specific applications such as close-range agricultural monitoring, where spectral signatures, spatial scales, and semantic targets differ fundamentally. This report presents Deep Global Clustering (DGC), a conceptual framework for memory-efficient HSI segmentation that learns a global clustering structure from local patch observations without pre-training. DGC operates on small patches whose overlapping regions enforce assignment consistency, enabling training in under 30 minutes on consumer hardware with constant memory usage. On a leaf disease dataset, DGC achieves background-tissue separation (mean IoU 0.925) and demonstrates unsupervised disease detection through navigable semantic granularity. However, the framework suffers from optimization instability rooted in multi-objective loss balancing: meaningful representations emerge rapidly but then degrade as clusters over-merge in feature space. We position this work as intellectual scaffolding: the design philosophy has merit, but a stable implementation requires principled approaches to dynamic loss balancing. Code and data are available at https://github.com/b05611038/HSI_global_clustering.
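The patch-overlap mechanism can be made concrete with a short sketch. The snippet below is a minimal illustration under stated assumptions, not the repository's actual implementation: the `overlap_consistency_loss` helper is hypothetical, it assumes the shared pixels of two overlapping patches have already been extracted into aligned logit tensors, and it uses a symmetric KL penalty as one plausible way to encourage both patches to assign those shared pixels to the same clusters.

```python
import torch
import torch.nn.functional as F

def overlap_consistency_loss(logits_a: torch.Tensor,
                             logits_b: torch.Tensor) -> torch.Tensor:
    """Agreement loss between the soft cluster assignments that two
    overlapping patches produce on their shared pixels.

    logits_a, logits_b: (N, C) cluster logits for the N pixels in the
    overlap region, extracted from each patch's prediction in the same
    pixel order.
    """
    log_p_a = F.log_softmax(logits_a, dim=1)
    log_p_b = F.log_softmax(logits_b, dim=1)
    p_a, p_b = log_p_a.exp(), log_p_b.exp()
    # Symmetric KL between the two soft assignment distributions,
    # averaged over the overlap pixels.
    kl_ab = F.kl_div(log_p_a, p_b, reduction="batchmean")
    kl_ba = F.kl_div(log_p_b, p_a, reduction="batchmean")
    return 0.5 * (kl_ab + kl_ba)
```

In a full training loop, a term like this would be summed with the clustering objectives the abstract refers to; weighting it against those objectives is precisely the multi-objective loss-balancing problem in which the reported instability arises.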