硬件和数据结构解决方案的高效建议引文 (MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions) - 专知论文

会员服务 ·

0

推断 · 可约的 · FPGA · Engineering · 正则化项 ·

2021 年 2 月 19 日

MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions

翻译：硬件和数据结构解决方案的高效建议引文

Wenqi Jiang,Zhenhao He,Shuai Zhang,Thomas B. Preußer,Kai Zeng,Liang Feng,Jiansong Zhang,Tongxuan Liu,Yong Li,Jingren Zhou,Ce Zhang,Gustavo Alonso

from arxiv, Accepted by MLSys'21 (the 4th Conference on Machine Learning and Systems)

Deep neural networks are widely used in personalized recommendation systems. Unlike regular DNN inference workloads, recommendation inference is memory-bound due to the many random memory accesses needed to lookup the embedding tables. The inference is also heavily constrained in terms of latency because producing a recommendation for a user must be done in about tens of milliseconds. In this paper, we propose MicroRec, a high-performance inference engine for recommendation systems. MicroRec accelerates recommendation inference by (1) redesigning the data structures involved in the embeddings to reduce the number of lookups needed and (2) taking advantage of the availability of High-Bandwidth Memory (HBM) in FPGA accelerators to tackle the latency by enabling parallel lookups. We have implemented the resulting design on an FPGA board including the embedding lookup step as well as the complete inference process. Compared to the optimized CPU baseline (16 vCPU, AVX2-enabled), MicroRec achieves 13.8~14.7x speedup on embedding lookup alone and 2.5$~5.4x speedup for the entire recommendation inference in terms of throughput. As for latency, CPU-based engines needs milliseconds for inferring a recommendation while MicroRec only takes microseconds, a significant advantage in real-time recommendation systems.

翻译：在个人化建议系统中广泛使用深心血管网络。与正常的 DNN 推断工作量不同,建议推导值与正常的 DNN 推断值不同,建议推导值具有内存性,因为要查看嵌入表需要许多随机的内存存存权限,因此建议值在延缓度方面也受到很大限制,因为为用户提出建议必须在大约几十毫秒内完成。在本文中,我们提议MicroRec,这是建议系统的一种高性能推导引擎。微Rec加速建议引文,办法是(1)重新设计嵌入中的数据结构,以减少所需的查勘次数,(2)利用FPGA 中高宽线内存(HBM)的可用性,通过平行查勘,处理延缓度问题。我们已经在FPGA 板上实施了相应的设计,包括嵌入式查取步骤以及完整的推导过程。与优化的CPU基线(16 VCPU, AVX2-C) 相比,微后加参考,微Rec 系统实现了安装全13.8~14.7x(H) 高级嵌入系统,同时查看2.5~5摩车建议。

0

相关内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

80+阅读 · 2020年7月26日

最新「因果推断Causal Inference」综述论文38页pdf，Buffalo、Georgia、阿里巴巴、Virginia

专知会员服务

183+阅读 · 2020年2月11日

自动结构变分推理，Automatic structured variational inference

自动结构变分推理，Automatic structured variational inference

专知会员服务

40+阅读 · 2020年2月10日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

183+阅读 · 2020年2月1日

【AAAI2020】拓扑贝叶斯优化与持久性图：Topological Bayesian Optimization with Persistence Diagrams

【AAAI2020】拓扑贝叶斯优化与持久性图：Topological Bayesian Optimization with Persistence Diagrams

专知会员服务

11+阅读 · 2020年1月17日

【Science论文】基于波的物理现象作为一种模拟递归神经网络（Wave physics as an analog recurrent neural network）

【Science论文】基于波的物理现象作为一种模拟递归神经网络（Wave physics as an analog recurrent neural network）

专知会员服务

12+阅读 · 2020年1月3日

【2020年AI趋势摘要：可嵌入、可迁移、可评价】《A Distilled List of AI Trends For 2020 - Towards Data Science》by Roberto Sannazzaro

【2020年AI趋势摘要：可嵌入、可迁移、可评价】《A Distilled List of AI Trends For 2020 - Towards Data Science》by Roberto Sannazzaro

专知会员服务

14+阅读 · 2019年12月20日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

LibRec 精选：位置感知的长序列会话推荐

LibRec 精选：位置感知的长序列会话推荐

LibRec智能推荐

3+阅读 · 2019年5月17日

人工智能 | UAI 2019等国际会议信息4条

人工智能 | UAI 2019等国际会议信息4条

Call4Papers

6+阅读 · 2019年1月14日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

42+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

17+阅读 · 2018年12月24日

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

专知

19+阅读 · 2018年6月26日

【论文推荐】最新八篇强化学习相关论文—残差网络、QMIX、元学习、动态速率分配、分层强化学习、抽象概况、快速物体检测、SOM

【论文推荐】最新八篇强化学习相关论文—残差网络、QMIX、元学习、动态速率分配、分层强化学习、抽象概况、快速物体检测、SOM

专知

7+阅读 · 2018年4月3日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【今日新增】IEEE Trans.专刊截稿信息8条

【今日新增】IEEE Trans.专刊截稿信息8条

Call4Papers

7+阅读 · 2017年6月29日

Particle RAIM

Arxiv

0+阅读 · 2021年4月13日

Group Recommendation Techniques for Feature Modeling and Configuration

Arxiv

0+阅读 · 2021年4月13日

High-performance, Distributed Training of Large-scale Deep Learning Recommendation Models

Arxiv

0+阅读 · 2021年4月12日

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Arxiv

14+阅读 · 2019年8月8日

Recommendation Systems for Tourism Based on Social Networks: A Survey

Recommendation Systems for Tourism Based on Social Networks: A Survey

Arxiv

3+阅读 · 2019年3月28日

HAQ: Hardware-Aware Automated Quantization

HAQ: Hardware-Aware Automated Quantization

Arxiv

6+阅读 · 2018年11月21日

Attention-based Group Recommendation

Arxiv

14+阅读 · 2018年4月18日

Learning over Knowledge-Base Embeddings for Recommendation

Arxiv

23+阅读 · 2018年3月22日

Towards Efficient Dynamic Virtual Network Embedding Strategy for Cloud IoT Networks

Arxiv

4+阅读 · 2018年1月30日

Learning to Speed Up Query Planning in Graph Databases

Arxiv

6+阅读 · 2018年1月21日

VIP会员

文章信息

相关主题

相关VIP内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

80+阅读 · 2020年7月26日

最新「因果推断Causal Inference」综述论文38页pdf，Buffalo、Georgia、阿里巴巴、Virginia

专知会员服务

183+阅读 · 2020年2月11日

自动结构变分推理，Automatic structured variational inference

自动结构变分推理，Automatic structured variational inference

专知会员服务

40+阅读 · 2020年2月10日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

183+阅读 · 2020年2月1日

【AAAI2020】拓扑贝叶斯优化与持久性图：Topological Bayesian Optimization with Persistence Diagrams

【AAAI2020】拓扑贝叶斯优化与持久性图：Topological Bayesian Optimization with Persistence Diagrams

专知会员服务

11+阅读 · 2020年1月17日

【Science论文】基于波的物理现象作为一种模拟递归神经网络（Wave physics as an analog recurrent neural network）

【Science论文】基于波的物理现象作为一种模拟递归神经网络（Wave physics as an analog recurrent neural network）

专知会员服务

12+阅读 · 2020年1月3日

【2020年AI趋势摘要：可嵌入、可迁移、可评价】《A Distilled List of AI Trends For 2020 - Towards Data Science》by Roberto Sannazzaro

【2020年AI趋势摘要：可嵌入、可迁移、可评价】《A Distilled List of AI Trends For 2020 - Towards Data Science》by Roberto Sannazzaro

专知会员服务

14+阅读 · 2019年12月20日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

无人艇集群路径规划研究综述: 深度强化学习

【WWW2025教程】人工智能在复杂网络中的应用：潜力、方法与应用

【ICML2025】利用多样本推理优化语言模型的温度参数

【NTU博士论文】让语言模型更接近人类学习者

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

LibRec 精选：位置感知的长序列会话推荐

LibRec 精选：位置感知的长序列会话推荐

LibRec智能推荐

3+阅读 · 2019年5月17日

人工智能 | UAI 2019等国际会议信息4条

人工智能 | UAI 2019等国际会议信息4条

Call4Papers

6+阅读 · 2019年1月14日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

42+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

17+阅读 · 2018年12月24日

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

专知

19+阅读 · 2018年6月26日

【论文推荐】最新八篇强化学习相关论文—残差网络、QMIX、元学习、动态速率分配、分层强化学习、抽象概况、快速物体检测、SOM

【论文推荐】最新八篇强化学习相关论文—残差网络、QMIX、元学习、动态速率分配、分层强化学习、抽象概况、快速物体检测、SOM

专知

7+阅读 · 2018年4月3日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【今日新增】IEEE Trans.专刊截稿信息8条

【今日新增】IEEE Trans.专刊截稿信息8条

Call4Papers

7+阅读 · 2017年6月29日

相关论文

Particle RAIM

Arxiv

0+阅读 · 2021年4月13日

Group Recommendation Techniques for Feature Modeling and Configuration

Arxiv

0+阅读 · 2021年4月13日

High-performance, Distributed Training of Large-scale Deep Learning Recommendation Models

Arxiv

0+阅读 · 2021年4月12日

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Arxiv

14+阅读 · 2019年8月8日

Recommendation Systems for Tourism Based on Social Networks: A Survey

Recommendation Systems for Tourism Based on Social Networks: A Survey

Arxiv

3+阅读 · 2019年3月28日

HAQ: Hardware-Aware Automated Quantization

HAQ: Hardware-Aware Automated Quantization

Arxiv

6+阅读 · 2018年11月21日

Attention-based Group Recommendation

Arxiv

14+阅读 · 2018年4月18日

Learning over Knowledge-Base Embeddings for Recommendation

Arxiv

23+阅读 · 2018年3月22日

Towards Efficient Dynamic Virtual Network Embedding Strategy for Cloud IoT Networks

Arxiv

4+阅读 · 2018年1月30日

Learning to Speed Up Query Planning in Graph Databases

Arxiv

6+阅读 · 2018年1月21日

微信扫码咨询专知VIP会员