Summarization is a widespread method for handling very large graphs. The task of structural graph summarization is to compute a concise but meaningful synopsis of the key structural information of a graph. As summaries may be used for many different purposes, there is no single concept or model of graph summaries. We have studied existing structural graph summaries for large-scale (semantic) graphs. Despite their different concepts and purposes, we found commonalities in the graph structures they capture. We use these commonalities to provide for the first time a formally defined common model, FLUID (FLexible graph sUmmarIes for Data graphs), that allows us to flexibly define structural graph summaries. FLUID allows graph summaries to be quickly defined, adapted, and compared for different purposes and datasets. To this end, FLUID provides features of structural summarization based on equivalence relations such as distinction of types and properties, direction of edges, bisimulation, and inference. We conduct a detailed complexity analysis of the features provided by FLUID. We show that graph summaries defined with FLUID can be computed in the worst case in time $\mathcal{O}(n^2)$ w.r.t. $n$, the number of edges in the data graph. An empirical analysis of large-scale web graphs with billions of edges indicates a typical running time of $\Theta(n)$. Based on the formal FLUID model, one can quickly define and modify various structural graph summaries from the literature and beyond.
翻译:缩略图是一种处理大图的广泛方法。 结构图形总和的任务是对图表的关键结构信息进行简单而有意义的简要的计算。 由于摘要可能用于许多不同的目的, 没有图形摘要的单一概念或模型。 我们研究了大规模( semantic) 图形的现有结构图形摘要。 尽管这些结构的概念和目的不同, 我们在图的结构结构中发现了共性。 我们第一次使用这些共性来提供一个正式定义的通用模型, FLUID( 数据图表的滚动图形 sUmmarIes), 使我们能够灵活地定义结构图表摘要。 FLUID 允许为不同的目的和数据集快速定义、调整和比较图形摘要。 为此, FLUID 提供了基于等等类型和属性区别、 边缘方向、 振动和推论等等等结构对等特性的共性。 我们对FLUID 提供的特征进行详细的复杂性分析。 我们显示, FLUID 定义的图表摘要可以快速地定义结构图表摘要, 用于最差的 美元 平面 。