BPP: Large Graph Storage for Efficient Disk Based Processing

January 10, 2014

Kamran Najeebullah,Kifayat Ullah Khan,Waqas Nawaz,Young-Koo Lee

from arxiv, 5 pages, Published in ICCA, 2013

Processing very large graphs, such as social networks and biological and chemical compounds, is a challenging task. Distributed graph processing systems process billion-scale graphs efficiently but incur the overhead of partitioning and distributing the graph over a cluster of nodes. Distributed processing also requires cluster management and fault tolerance. GraphChi was recently proposed to overcome these problems. GraphChi significantly outperformed all the representative distributed processing frameworks. Still, we observe that GraphChi suffers serious performance degradation due to 1) a high number of non-sequential I/Os for processing every chunk of the graph; and 2) a lack of true parallelism in processing the graph. In this paper we propose a simple yet powerful engine, BiShard Parallel Processor (BPP), to efficiently process billion-scale graphs on a single PC. We extend the storage structure proposed by GraphChi and introduce a new processing model called BiShard Parallel (BP). BP enables full CPU parallelism for processing the graph and significantly reduces the number of non-sequential I/Os required to process every chunk of the graph. Our experiments on real large graphs show that our solution significantly outperforms GraphChi.
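The non-sequential I/O issue mentioned in the abstract can be illustrated with a toy in-memory model. The sketch below assumes the GraphChi-style layout in which vertices are split into intervals and each shard holds the edges whose destination falls in one interval, sorted by source. The second layout, a per-interval copy grouped by source, is only a hypothetical illustration of how a two-copy ("bi-shard") structure could cut the number of files touched; it is not taken from the paper, and all function and variable names here are invented for the example.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, int]  # (source, destination)


def interval_of(vertex: int, upper_bounds: List[int]) -> int:
    """Return the index of the vertex interval containing `vertex`.

    `upper_bounds` lists the inclusive upper end of each interval in order.
    """
    for index, upper in enumerate(upper_bounds):
        if vertex <= upper:
            return index
    raise ValueError(f"vertex {vertex} is outside every interval")


def build_in_shards(edges: List[Edge], upper_bounds: List[int]) -> List[List[Edge]]:
    """GraphChi-style layout: shard p holds edges whose destination is in interval p."""
    shards: List[List[Edge]] = [[] for _ in upper_bounds]
    for src, dst in edges:
        shards[interval_of(dst, upper_bounds)].append((src, dst))
    for shard in shards:
        shard.sort()  # tuple order: by source, then by destination
    return shards


def build_out_shards(edges: List[Edge], upper_bounds: List[int]) -> List[List[Edge]]:
    """Hypothetical second copy (an assumption): shard p holds edges whose source is in interval p."""
    shards: List[List[Edge]] = [[] for _ in upper_bounds]
    for src, dst in edges:
        shards[interval_of(src, upper_bounds)].append((src, dst))
    for shard in shards:
        shard.sort()
    return shards


def shards_touched(in_shards: List[List[Edge]], p: int, upper_bounds: List[int]) -> Set[int]:
    """Shards that must be read to gather all in- and out-edges of interval p
    when only the destination-grouped copy exists."""
    touched = {p}  # every in-edge of interval p lives in shard p
    for index, shard in enumerate(in_shards):
        # Out-edges of interval p are scattered across shards by destination.
        if any(interval_of(src, upper_bounds) == p for src, _ in shard):
            touched.add(index)
    return touched


# Toy graph with six vertices split into intervals {1,2}, {3,4}, {5,6}.
EDGES: List[Edge] = [(1, 2), (1, 4), (2, 5), (3, 1), (4, 6), (5, 2), (6, 3)]
UPPER_BOUNDS = [2, 4, 6]
IN_SHARDS = build_in_shards(EDGES, UPPER_BOUNDS)
OUT_SHARDS = build_out_shards(EDGES, UPPER_BOUNDS)
```

In this toy graph, gathering the subgraph of the first interval from the single destination-grouped copy touches all three shards, whereas the hypothetical two-copy layout would need only `IN_SHARDS[0]` and `OUT_SHARDS[0]`. The trade-off in such a design is roughly doubled storage in exchange for fewer scattered reads.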

GraphChi

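GraphChi is a disk-based graph processing system developed by Aapo Kyrola, a PhD at CMU (Carnegie Mellon University). The system claims to efficiently process graphs with billions of edges.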
