Random walks are a primary means for extracting information from large-scale graphs. While most real-world graphs are inherently dynamic, state-of-the-art random walk engines failed to efficiently support such a critical use case. This paper takes the initiative to build a general random walk engine for dynamically changing graphs with two key principles: (i) This system should support both low-latency streaming updates and high-throughput batched updates. (ii) This system should achieve fast sampling speed while maintaining acceptable space consumption to support dynamic graph updates. Upholding both standards, we introduce Bingo, a GPU-based random walk engine for dynamically changing graphs. First, we propose a novel radix-based bias factorization algorithm to support constant time sampling complexity while supporting fast streaming updates. Second, we present a group-adaption design to reduce space consumption dramatically. Third, we incorporate GPU-aware designs to support high-throughput batched graph updates on massively parallel platforms. Together, Bingo outperforms existing efforts across various applications, settings, and datasets, achieving up to a 271.11x speedup compared to the state-of-the-art efforts.
翻译:随机游走是从大规模图中提取信息的主要手段。尽管大多数现实世界中的图本质上是动态的,但现有的先进随机游走引擎未能有效支持这一关键应用场景。本文率先构建了一个适用于动态变化图的通用随机游走引擎,遵循两个核心原则:(一)该系统应同时支持低延迟流式更新与高吞吐量批量更新;(二)该系统应在维持可接受空间开销以支持动态图更新的同时,实现快速的采样速度。基于这两项标准,我们提出了Bingo——一个基于GPU的动态变化图随机游走引擎。首先,我们提出了一种新颖的基于基数的偏置分解算法,在支持快速流式更新的同时实现常数时间采样复杂度。其次,我们提出了一种分组自适应设计以显著降低空间消耗。第三,我们整合了GPU感知设计以支持大规模并行平台上的高吞吐量批量图更新。综合而言,Bingo在各类应用场景、参数设置和数据集上均优于现有方案,相比最先进方法实现了最高达271.11倍的加速比。