Common Subexpression-based Compression and Multiplication of Sparse Constant Matrices

Matrix multiplication · Sparse · Addition · Embedded · Matrix compression methods

March 26, 2023

Emre Bilgili, Arda Yurdakul

In deep learning inference, model parameters are pruned and quantized to reduce the model size. Compression methods and common subexpression (CSE) elimination algorithms are applied to sparse constant matrices to deploy the models on low-cost embedded devices. However, the state-of-the-art CSE elimination methods do not scale well to large matrices. They take hours to extract CSEs from a $200 \times 200$ matrix, while their matrix multiplication algorithms run longer than conventional matrix multiplication methods. Moreover, no existing matrix compression method exploits CSEs. As a remedy, this paper proposes a random search-based algorithm to extract CSEs in the column pairs of a constant matrix. It produces an adder tree for a $1000 \times 1000$ matrix in a minute. To compress the adder tree, this paper presents a compression format that extends Compressed Sparse Row (CSR) to include CSEs. Compression rates of more than $50\%$ can be achieved compared to the original CSR format, and simulations for a single-core embedded system show that the matrix multiplication execution time can be reduced by $20\%$.
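To make the idea of a CSE in a column pair concrete, the following is a minimal Python sketch, not the authors' algorithm or compression format. It assumes a small 0/1 matrix, replaces the paper's random search with an exhaustive scan over column pairs, and represents the shared term as one extra "virtual" column appended to a plain CSR matrix. The function names (`dense_to_csr`, `csr_matvec`, `most_shared_column_pair`, `extract_cse`) are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def dense_to_csr(matrix):
    """Convert a dense list-of-lists matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Conventional CSR matrix-vector product y = A x."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def most_shared_column_pair(matrix):
    """Exhaustively find the column pair (j, k) with equal nonzero entries in the most rows.

    Assumption: exhaustive scan for clarity; the paper uses a random search instead.
    """
    best_count, best_pair = 0, None
    for j, k in combinations(range(len(matrix[0])), 2):
        count = sum(1 for row in matrix if row[j] != 0 and row[j] == row[k])
        if count > best_count:
            best_count, best_pair = count, (j, k)
    return best_count, best_pair


def extract_cse(matrix, pair):
    """Append a virtual column standing for x[j] + x[k] and rewrite the rows that share it."""
    j, k = pair
    rewritten = []
    for row in matrix:
        row = list(row)
        if row[j] != 0 and row[j] == row[k]:
            row.append(row[j])      # coefficient applied to the shared term
            row[j] = row[k] = 0     # the two original entries are no longer needed
        else:
            row.append(0)
        rewritten.append(row)
    return rewritten


A = [
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
]
x = [3, 5, 7, 11]

count, pair = most_shared_column_pair(A)
A_cse = extract_cse(A, pair)

# The shared term x[j] + x[k] is computed once and reused by every row that needs it.
x_ext = x + [x[pair[0]] + x[pair[1]]]

y_plain = csr_matvec(*dense_to_csr(A), x)
y_cse = csr_matvec(*dense_to_csr(A_cse), x_ext)

nnz_plain = len(dense_to_csr(A)[0])
nnz_cse = len(dense_to_csr(A_cse)[0])
```

In this toy case columns 0 and 1 are both nonzero in three rows, so `x[0] + x[1]` is computed once. Both products give the same result, while the stored nonzeros drop from 10 to 7. Repeating the extraction on the rewritten matrix would build up a tree of shared additions, which is the kind of structure the abstract calls an adder tree.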
