Superpixels are higher-order perceptual groups of pixels in an image, often carrying much more information than the raw pixels. The superpixels of an image exhibit an inherent relational structure; for example, adjacent superpixels are neighbours of each other. Our interest here is to treat these relative positions of superpixels as relational information about the image. Such relational information can convey higher-order spatial cues, such as the relationship between the superpixels representing the two eyes in an image of a cat: the two eyes lie adjacent to each other along a straight line, and the mouth lies below the nose. Our aim in this paper is to assist computer vision models, specifically those based on Deep Neural Networks (DNNs), by incorporating this higher-order information from superpixels. We construct a hybrid model that leverages (a) a Convolutional Neural Network (CNN) to deal with spatial information in an image and (b) a Graph Neural Network (GNN) to deal with relational superpixel information in the image. The proposed model is trained using a generic hybrid loss function. Our experiments are extensive, and we evaluate the predictive performance of the proposed hybrid vision model on seven image classification datasets from a variety of domains, such as digit and object recognition, biometrics, and medical imaging. The results demonstrate that relational superpixel information processed by a GNN can improve the performance of a standard CNN-based vision system.
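The adjacency relation between superpixels described above can be made concrete with a small sketch. This is an illustrative example, not the authors' pipeline: given a segmentation label map (e.g. as produced by SLIC), it extracts the set of superpixel pairs that share a pixel boundary under 4-connectivity, which is exactly the edge set a GNN would operate on.

```python
import numpy as np

def superpixel_adjacency(labels: np.ndarray) -> set:
    """Return the set of (i, j) superpixel label pairs (i < j) that
    share a boundary in the label map, using 4-connectivity."""
    edges = set()
    # Compare each pixel with its right neighbour and bottom neighbour.
    horizontal = (labels[:, :-1], labels[:, 1:])
    vertical = (labels[:-1, :], labels[1:, :])
    for u, v in (horizontal, vertical):
        mask = u != v  # boundary pixels between two different superpixels
        for i, j in zip(u[mask].ravel(), v[mask].ravel()):
            edges.add((min(int(i), int(j)), max(int(i), int(j))))
    return edges

# A toy 3x3 label map with three superpixels (0, 1, 2):
labels = np.array([[0, 0, 1],
                   [0, 2, 1],
                   [2, 2, 1]])
print(superpixel_adjacency(labels))  # → {(0, 1), (0, 2), (1, 2)}
```

In a full hybrid model, each node of this graph would carry features pooled from the CNN over the corresponding superpixel region, and the GNN would propagate information along these edges.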