Efficient and Scalable Graph Pattern Mining on GPUs - 专知 Papers


December 17, 2021

Efficient and Scalable Graph Pattern Mining on GPUs


Xuhao Chen, Arvind

We describe G2Miner, the first Graph Pattern Mining (GPM) framework that runs on multiple GPUs. G2Miner uses pattern-aware, input-aware, and architecture-aware search strategies to achieve high efficiency on GPUs. To simplify programming, it provides a code generator that automatically generates pattern-aware CUDA code. G2Miner flexibly supports both breadth-first search (BFS) and depth-first search (DFS) to maximize memory utilization and generate sufficient parallelism for GPUs. To scale across devices, G2Miner uses a customized scheduling policy that balances work among multiple GPUs. Experiments on a V100 GPU show that G2Miner achieves average speedups of 5.4x and 7.2x over two state-of-the-art single-GPU systems, Pangolin and PBE, respectively. In the multi-GPU setting, G2Miner achieves linear speedups from 1 to 8 GPUs for various patterns and data graphs. We also show that G2Miner on a V100 GPU is 48.3x and 15.2x faster than the state-of-the-art CPU-based systems, Peregrine and GraphZero, respectively, running on a 56-core CPU machine.
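To make "pattern-aware DFS" concrete, here is a minimal CPU-side sketch in Python for one pattern, the triangle. It is not G2Miner's generated CUDA code, and every name in it (`orient`, `intersect_count`, `count_triangles`) is hypothetical. It assumes an undirected graph given as an edge list, orients each edge from the lower to the higher vertex id so that each triangle is found exactly once, and replaces the innermost search level with a sorted-list intersection.

```python
from typing import Dict, Iterable, List, Tuple


def orient(edges: Iterable[Tuple[int, int]]) -> Dict[int, List[int]]:
    """Keep each undirected edge only as (lower id -> higher id).

    This symmetry-breaking step is an assumption of the sketch; it
    ensures every triangle is enumerated once instead of six times.
    """
    adj: Dict[int, set] = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        lo, hi = (u, v) if u < v else (v, u)
        adj.setdefault(lo, set()).add(hi)
        adj.setdefault(hi, set())
    # Sorted neighbor lists allow a linear-time merge intersection.
    return {v: sorted(ns) for v, ns in adj.items()}


def intersect_count(a: List[int], b: List[int]) -> int:
    """Count common elements of two sorted lists with a two-pointer merge."""
    i = j = n = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            n += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return n


def count_triangles(edges: Iterable[Tuple[int, int]]) -> int:
    """Depth-first triangle counting specialized to the triangle pattern."""
    adj = orient(edges)
    total = 0
    for u, nbrs in adj.items():          # level 0: pick vertex u
        for v in nbrs:                   # level 1: extend to neighbor v > u
            # level 2: any w adjacent to both u and v closes a triangle
            total += intersect_count(nbrs, adj[v])
    return total


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(count_triangles(k4))  # complete graph on 4 vertices
```

In a GPU setting, the two outer loops are the natural place to expose parallelism (for example, one edge per warp), but how G2Miner actually maps work to threads and GPUs is not specified in the abstract and is not modeled here.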

