SSAG: Summarization and Sparsification of Attributed Graphs

We present SSAG, an efficient and scalable method for computing a lossy graph summary that retains the essential structure of the original graph. SSAG computes a sparse representation (summary) of the input graph and also handles graphs with node attributes. The summary of a graph $G$ is stored as a graph on supernodes (subsets of the vertices of $G$), in which pairs of supernodes are connected by weighted superedges. The proposed method constructs a summary graph on $k$ supernodes that minimizes the reconstruction error (the difference between the original graph and the graph reconstructed from the summary) and maximizes homogeneity with respect to attribute values. We construct the summary by iteratively merging pairs of nodes. We derive a closed-form expression to efficiently compute the reconstruction error after merging a pair, and we approximate this score in constant time. To reduce the search space for selecting the best pair to merge, we assign each supernode a weight that closely quantifies the node's contribution to the score of the pairs containing it. We choose the best pair for merging from a random sample of supernodes, each selected with probability proportional to its weight. With weighted sampling, a logarithmic-sized sample yields a summary of comparable quality across various quality measures. We propose a sparsification step for the constructed summary that reduces the storage cost to a given target size with a marginal increase in reconstruction error. Empirical evaluation on several real-world graphs and comparison with state-of-the-art methods show that SSAG is up to $5\times$ faster and generates summaries of comparable quality. We further demonstrate the quality of SSAG summaries by accurately and efficiently answering queries about the graph structure and attribute information using the summary only.
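To make the merging loop concrete, the following is a minimal Python sketch of one merge step, not the authors' implementation. It assumes the reconstruction error is the $\ell_1$ distance between the adjacency matrix and the expected adjacency matrix implied by the summary (superedge weight divided by the number of possible vertex pairs), which the abstract does not spell out. It also recomputes the error from scratch for every candidate pair instead of using the closed-form or constant-time score, and it takes the supernode weights as an input rather than deriving them. The names `reconstruction_error` and `best_pair_from_sample` are hypothetical.

```python
import random
from itertools import combinations


def reconstruction_error(n, edges, partition):
    """L1 error between the graph and its summary, over unordered vertex pairs.

    n: number of vertices, labelled 0..n-1
    edges: iterable of undirected edges (u, v) with u != v
    partition: list of supernodes, each a list of vertices
    Assumption: the summary reconstructs each vertex pair as
    (superedge weight) / (number of possible pairs between the supernodes).
    """
    edge_set = {frozenset(e) for e in edges}
    block = {v: i for i, members in enumerate(partition) for v in members}

    # Superedge weight = number of original edges between (or inside) supernodes.
    weight = {}
    for e in edge_set:
        u, v = tuple(e)
        key = tuple(sorted((block[u], block[v])))
        weight[key] = weight.get(key, 0) + 1

    def expected(u, v):
        i, j = sorted((block[u], block[v]))
        if i == j:
            size = len(partition[i])
            pairs = size * (size - 1) // 2
        else:
            pairs = len(partition[i]) * len(partition[j])
        return weight.get((i, j), 0) / pairs

    return sum(
        abs((frozenset((u, v)) in edge_set) - expected(u, v))
        for u, v in combinations(range(n), 2)
    )


def best_pair_from_sample(n, edges, partition, weights, sample_size, rng):
    """Sample supernodes proportionally to `weights` and return the sampled
    pair whose merge gives the lowest error, as (error, i, j), or None if
    fewer than two distinct supernodes were sampled."""
    sampled = set(rng.choices(range(len(partition)), weights=weights, k=sample_size))
    best = None
    for i, j in combinations(sorted(sampled), 2):
        merged = [m for t, m in enumerate(partition) if t not in (i, j)]
        merged.append(partition[i] + partition[j])
        err = reconstruction_error(n, edges, merged)
        if best is None or err < best[0]:
            best = (err, i, j)
    return best


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3)]  # path graph on 4 vertices
    singletons = [[0], [1], [2], [3]]
    print(reconstruction_error(4, path, [[0, 1], [2, 3]]))
    print(best_pair_from_sample(4, path, singletons, [1, 1, 1, 1], 3, random.Random(0)))
```

Repeating `best_pair_from_sample` and applying the returned merge until $k$ supernodes remain gives a naive version of the iterative scheme. The sample is drawn with replacement via `random.choices`, which is a simplification chosen for brevity.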
