FPGAs 包装式拆散器设计原则 (Design Principles for Packet Deparsers on FPGAs) - 专知论文

会员服务 ·

0

perforce · Principle · 可约的 · Networking · 设计 ·

2021 年 3 月 13 日

Design Principles for Packet Deparsers on FPGAs

翻译：FPGAs 包装式拆散器设计原则

Thomas Luinaud,Jeferson Santiago da Silva,J. M. Pierre Langlois,Yvon Savaria

from arxiv, Presented at ISFPGA'21, 2021 Source code available at : https://github.com/luinaudt/deparser/tree/FPGA_paper

The P4 language has drastically changed the networking field as it allows to quickly describe and implement new networking applications. Although a large variety of applications can be described with the P4 language, current programmable switch architectures impose significant constraints on P4 programs. To address this shortcoming, FPGAs have been explored as potential targets for P4 applications. P4 applications are described using three abstractions: a packet parser, match-action tables, and a packet deparser, which reassembles the output packet with the result of the match-action tables. While implementations of packet parsers and match-action tables on FPGAs have been widely covered in the literature, no general design principles have been presented for the packet deparser. Indeed, implementing a high-speed and efficient deparser on FPGAs remains an open issue because it requires a large amount of interconnections and the architecture must be tailored to a P4 program. As a result, in several works where a P4 application is implemented on FPGAs, the deparser consumes a significant proportion of chip resources. Hence, in this paper, we address this issue by presenting design principles for efficient and high-speed deparsers on FPGAs. As an artifact, we introduce a tool that generates an efficient vendor-agnostic deparser architecture from a P4 program. Our design has been validated and simulated with a cocotb-based framework. The resulting architecture is implemented on Xilinx Ultrascale+ FPGAs and supports a throughput of more than 200 Gbps while reducing resource usage by almost 10$\times$ compared to other solutions.

翻译：P4 语言在快速描述和实施新的网络应用程序时大大改变了网络字段。虽然可以用P4语言描述大量各种应用程序,但目前的可编程开关结构对P4程序施加了重大限制。为解决这一缺陷,已经探索了P4应用程序的潜在目标。P4 应用程序使用三个抽象来描述:一个包包包包、配对操作表和一个包拆解器,它们根据匹配操作表的结果重新组合了美元产出包。尽管对FPGAs的包拆解析器和配对操作表的实施在文献中已广泛覆盖,但对PFPGAs软件的配置却没有提出任何一般性的设计原则。事实上,在FPG应用程序上实施高速高效的拆分解器仍然是个未解决的问题,因为它需要大量的互连和结构必须适应一个P4程序。结果是,在一些工作中,在FP4 基GGGAs 上实施一个P4 应用程序,而DP4 将相当一部分的芯片资源在文献中被广泛使用,因此,我们通过一个高效的SAFA 格式,我们通过一个高效的模型来展示这个工具的版本。

0

相关内容

perforce

perforce

【陈天奇】TVM：端到端自动深度学习编译器，244页ppt

【陈天奇】TVM：端到端自动深度学习编译器，244页ppt

专知会员服务

85+阅读 · 2020年5月11日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

105+阅读 · 2020年5月3日

【阿里巴巴达摩院】TResNet: 高性能的GPU专用架构，GPU-Dedicated Architecture

【阿里巴巴达摩院】TResNet: 高性能的GPU专用架构，GPU-Dedicated Architecture

专知会员服务

30+阅读 · 2020年4月1日

【SIGMOD2020-CMU】在内存中搜索树的顺序保持键压缩，Order-Preserving Key Compression for In-Memory Search Trees

【SIGMOD2020-CMU】在内存中搜索树的顺序保持键压缩，Order-Preserving Key Compression for In-Memory Search Trees

专知会员服务

14+阅读 · 2020年3月7日

如何加速NVIDIA gpu上的训练、推理和ML应用？108页ppt，Accelerating training, inference, and ML applications on NVIDIA GPUs

如何加速NVIDIA gpu上的训练、推理和ML应用？108页ppt，Accelerating training, inference, and ML applications on NVIDIA GPUs

专知会员服务

58+阅读 · 2019年12月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

45+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

53+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

167+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

77+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

90+阅读 · 2019年10月10日

CCF推荐 | 国际会议信息6条

CCF推荐 | 国际会议信息6条

Call4Papers

9+阅读 · 2019年8月13日

人工智能 | CCF推荐期刊专刊约稿信息6条

人工智能 | CCF推荐期刊专刊约稿信息6条

Call4Papers

5+阅读 · 2019年2月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

16+阅读 · 2018年12月24日

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

Call4Papers

5+阅读 · 2018年12月7日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

老铁，邀请你来免费学习人工智能！！！

老铁，邀请你来免费学习人工智能！！！

量化投资与机器学习

4+阅读 · 2017年11月14日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【推荐】树莓派/OpenCV/dlib人脸定位/瞌睡检测

【推荐】树莓派/OpenCV/dlib人脸定位/瞌睡检测

机器学习研究会

9+阅读 · 2017年10月24日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

【今日新增】IEEE Trans.专刊截稿信息8条

【今日新增】IEEE Trans.专刊截稿信息8条

Call4Papers

7+阅读 · 2017年6月29日

Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection

Arxiv

13+阅读 · 2020年12月3日

Detect-to-Retrieve: Efficient Regional Aggregation for Image Search

Arxiv

5+阅读 · 2019年5月14日

PFLD: A Practical Facial Landmark Detector

Arxiv

5+阅读 · 2019年2月28日

Deep Structured Prediction with Nonlinear Output Transformations

Arxiv

4+阅读 · 2018年11月1日

Fire SSD: Wide Fire Modules based Single Shot Detector on Edge Device

Arxiv

3+阅读 · 2018年10月16日

Apple Flower Detection using Deep Convolutional Networks

Arxiv

3+阅读 · 2018年9月17日

Acquisition of Localization Confidence for Accurate Object Detection

Acquisition of Localization Confidence for Accurate Object Detection

Arxiv

4+阅读 · 2018年7月30日

The Vadalog System: Datalog-based Reasoning for Knowledge Graphs

The Vadalog System: Datalog-based Reasoning for Knowledge Graphs

Arxiv

5+阅读 · 2018年7月23日

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

Arxiv

4+阅读 · 2018年3月15日

Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation

Arxiv

5+阅读 · 2018年3月13日

VIP会员

文章信息

相关主题

相关VIP内容

【陈天奇】TVM：端到端自动深度学习编译器，244页ppt

【陈天奇】TVM：端到端自动深度学习编译器，244页ppt

专知会员服务

85+阅读 · 2020年5月11日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

105+阅读 · 2020年5月3日

【阿里巴巴达摩院】TResNet: 高性能的GPU专用架构，GPU-Dedicated Architecture

【阿里巴巴达摩院】TResNet: 高性能的GPU专用架构，GPU-Dedicated Architecture

专知会员服务

30+阅读 · 2020年4月1日

【SIGMOD2020-CMU】在内存中搜索树的顺序保持键压缩，Order-Preserving Key Compression for In-Memory Search Trees

【SIGMOD2020-CMU】在内存中搜索树的顺序保持键压缩，Order-Preserving Key Compression for In-Memory Search Trees

专知会员服务

14+阅读 · 2020年3月7日

如何加速NVIDIA gpu上的训练、推理和ML应用？108页ppt，Accelerating training, inference, and ML applications on NVIDIA GPUs

如何加速NVIDIA gpu上的训练、推理和ML应用？108页ppt，Accelerating training, inference, and ML applications on NVIDIA GPUs

专知会员服务

58+阅读 · 2019年12月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

45+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

53+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

167+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

77+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

90+阅读 · 2019年10月10日

热门VIP内容

相关资讯

CCF推荐 | 国际会议信息6条

CCF推荐 | 国际会议信息6条

Call4Papers

9+阅读 · 2019年8月13日

人工智能 | CCF推荐期刊专刊约稿信息6条

人工智能 | CCF推荐期刊专刊约稿信息6条

Call4Papers

5+阅读 · 2019年2月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

16+阅读 · 2018年12月24日

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

Call4Papers

5+阅读 · 2018年12月7日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

老铁，邀请你来免费学习人工智能！！！

老铁，邀请你来免费学习人工智能！！！

量化投资与机器学习

4+阅读 · 2017年11月14日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【推荐】树莓派/OpenCV/dlib人脸定位/瞌睡检测

【推荐】树莓派/OpenCV/dlib人脸定位/瞌睡检测

机器学习研究会

9+阅读 · 2017年10月24日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

【今日新增】IEEE Trans.专刊截稿信息8条

【今日新增】IEEE Trans.专刊截稿信息8条

Call4Papers

7+阅读 · 2017年6月29日

相关论文

Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection

Arxiv

13+阅读 · 2020年12月3日

Detect-to-Retrieve: Efficient Regional Aggregation for Image Search

Arxiv

5+阅读 · 2019年5月14日

PFLD: A Practical Facial Landmark Detector

Arxiv

5+阅读 · 2019年2月28日

Deep Structured Prediction with Nonlinear Output Transformations

Arxiv

4+阅读 · 2018年11月1日

Fire SSD: Wide Fire Modules based Single Shot Detector on Edge Device

Arxiv

3+阅读 · 2018年10月16日

Apple Flower Detection using Deep Convolutional Networks

Arxiv

3+阅读 · 2018年9月17日

Acquisition of Localization Confidence for Accurate Object Detection

Acquisition of Localization Confidence for Accurate Object Detection

Arxiv

4+阅读 · 2018年7月30日

The Vadalog System: Datalog-based Reasoning for Knowledge Graphs

The Vadalog System: Datalog-based Reasoning for Knowledge Graphs

Arxiv

5+阅读 · 2018年7月23日

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

Arxiv

4+阅读 · 2018年3月15日

Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation

Arxiv

5+阅读 · 2018年3月13日

微信扫码咨询专知VIP会员