Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation - Zhuanzhi (专知) Papers


Data augmentation · Distillation · Projection · Tokenizer · MoDELS

March 11, 2023

Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation


Ziqi Wang, Yuexin Wu, Frederick Liu, Daogao Liu, Le Hou, Hongkun Yu, Jing Li, Heng Ji

From arXiv; 20 pages, 5 figures. Accepted at ICLR 2023.

Knowledge distillation is one of the primary methods of transferring knowledge from large to small models. However, it requires massive task-specific data, which may not be available in many real-world applications. Data augmentation methods such as representation interpolation, token replacement, or augmentation with models are applied to tackle this problem. However, these methods either risk shifting decision boundaries (representation interpolation), are not expressive enough (token replacement), or introduce too much computational overhead (augmentation with models). To this end, we propose AugPro (Augmentation with Projection), an effective and efficient data augmentation method for distillation. Our method builds on representation interpolation to maintain the diversity of expressions, and converts the augmented data to tokens to avoid shifting decision boundaries. It uses simple operations that add little computational overhead. Results on multiple GLUE tasks show that our method improves distillation performance by a large margin at a low time cost. Code is available at https://github.com/google-research/google-research/tree/master/augpro.
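The two steps described in the abstract (interpolate representations, then convert the result back to tokens) can be illustrated with a small sketch. The abstract does not specify the interpolation scheme or the projection rule, so everything here is an assumption made for illustration: a MixUp-style convex combination of token embeddings, a Euclidean nearest-neighbour lookup in the embedding table as the projection, and the hypothetical function names `interpolate`, `project_to_tokens`, and `augment`. It is not the authors' implementation; see the linked repository for that.

```python
import numpy as np


def interpolate(emb_a, emb_b, lam):
    """Convex combination of two (seq_len, dim) embedding sequences (assumed MixUp-style)."""
    return lam * emb_a + (1.0 - lam) * emb_b


def project_to_tokens(reps, embedding_table):
    """Return, for each row of `reps`, the id of the closest row in `embedding_table`.

    Closeness is squared Euclidean distance (an assumption), expanded as
    ||r||^2 - 2 r.e + ||e||^2 to avoid materialising a 3-D difference tensor.
    """
    d2 = (
        (reps ** 2).sum(axis=1, keepdims=True)
        - 2.0 * reps @ embedding_table.T
        + (embedding_table ** 2).sum(axis=1)
    )
    return d2.argmin(axis=1)


def augment(ids_a, ids_b, embedding_table, lam):
    """Interpolate two equal-length token-id sequences in embedding space,
    then project the result back to discrete token ids."""
    mixed = interpolate(embedding_table[ids_a], embedding_table[ids_b], lam)
    return project_to_tokens(mixed, embedding_table)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    table = rng.normal(size=(100, 16))      # toy vocabulary of 100 tokens
    ids_a = rng.integers(0, 100, size=8)    # two toy "sentences" of 8 tokens
    ids_b = rng.integers(0, 100, size=8)
    print(augment(ids_a, ids_b, table, lam=0.5))
```

Because the output is a sequence of ordinary token ids, it can be fed to a model exactly like real text, which is the property the abstract relies on to avoid shifting decision boundaries. The projection costs one matrix product against the embedding table, in line with the claim of little computational overhead.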



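Related content

Data augmentation

In machine learning, data augmentation generally refers to techniques (such as data distillation or balancing positive and negative samples) that improve the quality of a model's dataset and enlarge the data.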


