For semantic segmentation of remote sensing images (RSI), the trade-off between representation power and localization accuracy is important. How to achieve this trade-off effectively remains an open question, since current approaches that rely on very deep models yield complex models with large memory consumption. In contrast to previous work that utilizes dilated convolutions or deep models, we propose a novel two-stream deep neural network for semantic segmentation of RSI (RSI-Net), which obtains improved performance by effectively modeling and propagating spatial contextual structure and by using a decoding scheme that combines image-level and graph-level information. The first component explicitly models correlations between adjacent land covers and conducts flexible convolution on arbitrarily irregular image regions by using a graph convolutional network, while a densely connected atrous convolution network (DenseAtrousCNet) with multi-scale atrous convolution expands the receptive field and captures global image information. Extensive experiments are conducted on the Vaihingen, Potsdam and Gaofen RSI datasets, where the comparison results demonstrate the superior performance of RSI-Net in terms of overall accuracy (91.83%, 93.31% and 93.67% on the three datasets, respectively), F1 score (90.3%, 91.49% and 89.35%, respectively) and kappa coefficient (89.46%, 90.46% and 90.37%, respectively) when compared with six state-of-the-art RSI semantic segmentation methods.
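For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how overall accuracy, F1 score and the kappa coefficient can be computed from a pixel-level confusion matrix. It is not the evaluation code behind the reported numbers. The function name `segmentation_metrics`, the macro-averaging of per-class F1, and the toy 2x2 matrix are all assumptions made for illustration; the abstract does not state how F1 is averaged across classes.

```python
def segmentation_metrics(confusion):
    """Return (overall accuracy, mean F1, Cohen's kappa).

    confusion[i][j] is the number of pixels whose true class is i
    and whose predicted class is j (hypothetical layout for this sketch).
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    diag = [confusion[c][c] for c in range(n_classes)]
    # Row sums count pixels per true class; column sums per predicted class.
    true_counts = [sum(confusion[c]) for c in range(n_classes)]
    pred_counts = [
        sum(confusion[i][c] for i in range(n_classes)) for c in range(n_classes)
    ]

    # Overall accuracy: fraction of correctly labelled pixels.
    overall_accuracy = sum(diag) / total

    # Per-class F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (true + predicted).
    f1_scores = []
    for c in range(n_classes):
        denom = true_counts[c] + pred_counts[c]
        f1_scores.append(2 * diag[c] / denom if denom else 0.0)
    # Assumption: unweighted (macro) average over classes.
    mean_f1 = sum(f1_scores) / n_classes

    # Kappa compares observed agreement with chance agreement.
    expected = sum(t * p for t, p in zip(true_counts, pred_counts)) / (total * total)
    kappa = (overall_accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return overall_accuracy, mean_f1, kappa


# Toy two-class confusion matrix (made-up counts, not from any dataset).
toy_confusion = [[50, 10],
                 [5, 35]]
oa, f1, kappa = segmentation_metrics(toy_confusion)
print(f"OA={oa:.4f}  F1={f1:.4f}  kappa={kappa:.4f}")
```

Kappa is typically lower than overall accuracy on the same predictions because it discounts the agreement expected by chance, which is why the two are reported side by side.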