FracBits: Mixed Precision Quantization via Fractional Bit-Widths - 专知 (Zhuanzhi) Papers

December 3, 2020

Linjie Yang, Qing Jin

From arXiv; accepted by AAAI 2021

Model quantization reduces the model size and latency of deep neural networks. Mixed precision quantization is well suited to customized hardware that supports arithmetic operations at multiple bit-widths to achieve maximum efficiency. We propose a novel learning-based algorithm that derives mixed precision models end-to-end under target computation constraints and model sizes. During optimization, the bit-width of each layer or kernel in the model is held in a fractional state between two consecutive bit-widths and can be adjusted gradually. A differentiable regularization term allows the resource constraints to be met during quantization-aware training, which yields an optimized mixed precision model. Our method can also be combined naturally with channel pruning for better allocation of computation cost. Our final models achieve comparable or better performance than previous mixed precision quantization methods on MobileNetV1/V2 and ResNet18 under different resource constraints on the ImageNet dataset.
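
The abstract describes a bit-width that sits between two consecutive integers and a differentiable penalty on resource use. The sketch below is one possible reading of that idea, not the authors' implementation: it assumes the fractional state is a linear interpolation between the two neighboring integer quantizers, and it assumes a simple hinge penalty on model size. The function names, the uniform quantizer on [0, 1], and the size budget are all hypothetical choices made for illustration.

```python
import math


def quantize_uniform(x, bits):
    """Quantize a scalar in [0, 1] onto a uniform grid with 2**bits points.

    Hypothetical quantizer used only for this illustration.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    levels = (1 << bits) - 1           # number of steps between 0 and 1
    x = min(max(x, 0.0), 1.0)          # clip to the representable range
    return round(x * levels) / levels


def quantize_fractional(x, frac_bits):
    """Blend the quantizers at floor(frac_bits) and ceil(frac_bits).

    Assumption: the fractional state is a linear interpolation, so the
    output varies continuously as frac_bits moves between two integers.
    """
    lo = math.floor(frac_bits)
    hi = math.ceil(frac_bits)
    if lo == hi:                       # already an integer bit-width
        return quantize_uniform(x, lo)
    w_hi = frac_bits - lo              # weight on the higher precision
    return (1.0 - w_hi) * quantize_uniform(x, lo) + w_hi * quantize_uniform(x, hi)


def size_penalty(layer_params, layer_bits, budget_bits):
    """Hinge penalty: zero when the model fits the budget, linear above it.

    Assumption: model size is the sum of (parameter count * bit-width)
    over layers; the penalty is piecewise linear in each bit-width.
    """
    total_bits = sum(n * b for n, b in zip(layer_params, layer_bits))
    return max(0.0, total_bits - budget_bits)


if __name__ == "__main__":
    print(quantize_fractional(0.4, 3.5))
    print(size_penalty([100, 200], [3.5, 4.0], 1000))
```

In a training loop, a penalty of this kind would be added to the task loss so that gradient steps on the fractional bit-widths push the model toward the budget; after training, each bit-width would be rounded to an integer for deployment.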

