We propose a novel approach to visual representation learning called Signature-Graph Neural Networks (SGN). SGN learns latent global structures that augment the feature representations of Convolutional Neural Networks (CNNs). SGN constructs a unique undirected graph for each image based on the CNN feature maps. The feature maps are partitioned into a set of equally sized, non-overlapping patches. The graph nodes are placed at high-contrast, sharp convolutional features, i.e., the local maxima or minima within these patches. The node embeddings are aggregated through novel Signature-Graphs built from horizontal and vertical edge connections. The representation vectors are then computed from the spectral Laplacian eigenvalues of the graphs. SGN outperforms recent methods based on graph convolutional networks, generative adversarial networks, and auto-encoders, achieving image classification accuracies of 99.65% on ASIRRA, 99.91% on MNIST, 98.55% on Fashion-MNIST, 96.18% on CIFAR-10, 84.71% on CIFAR-100, 94.36% on STL10, and 95.86% on SVHN. We also introduce a novel implementation of state-of-the-art multi-head attention (MHA) on top of the proposed SGN. Adding SGN to MHA improves image classification accuracy from 86.92% to 94.36% on the STL10 dataset.
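To make the pipeline above concrete, the following is a minimal sketch of the graph-construction and spectral-signature steps for a single-channel feature map. The exact node-selection rule, edge weighting, and number of retained eigenvalues are not specified here, so the choices below (strongest absolute response per patch, similarity-weighted horizontal/vertical edges, k smallest Laplacian eigenvalues) and all names such as signature_graph_embedding are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def signature_graph_embedding(feature_map, patch_size=4, k=8):
    # Hypothetical sketch: partition a CNN feature map into non-overlapping
    # patches, place one node per patch at its strongest response (local
    # maximum or minimum), connect neighbouring nodes horizontally and
    # vertically, and summarise the graph by Laplacian eigenvalues.
    H, W = feature_map.shape
    rows, cols = H // patch_size, W // patch_size

    # One node per patch: the value of the local extremum (assumed rule).
    node_vals = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            patch = feature_map[i*patch_size:(i+1)*patch_size,
                                j*patch_size:(j+1)*patch_size]
            node_vals[i, j] = patch.flat[np.abs(patch).argmax()]

    # Horizontal and vertical edges between neighbouring patch nodes,
    # weighted by feature similarity (one plausible choice).
    n = rows * cols
    A = np.zeros((n, n))
    idx = lambda i, j: i * cols + j
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((0, 1), (1, 0)):
                if i + di < rows and j + dj < cols:
                    w = np.exp(-abs(node_vals[i, j] - node_vals[i+di, j+dj]))
                    A[idx(i, j), idx(i+di, j+dj)] = w
                    A[idx(i+di, j+dj), idx(i, j)] = w

    # Spectral signature: the k smallest eigenvalues of the graph Laplacian.
    L = np.diag(A.sum(axis=1)) - A
    eigvals = np.linalg.eigvalsh(L)
    return eigvals[:k]

# Example: a 32x32 feature map yields an 8-dimensional spectral signature
# that could augment the CNN feature vector before classification.
signature = signature_graph_embedding(np.random.randn(32, 32))
print(signature.shape)  # (8,)
```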