Deep Collective Knowledge Distillation (专知 paper listing)

April 18, 2023

Deep Collective Knowledge Distillation

Jihyeon Seo, Kyusam Oh, Chanho Min, Yongkeun Yun, Sungwoo Cho

Many existing studies on knowledge distillation focus on making a student model mimic its teacher model well. Simply imitating the teacher's knowledge, however, is not sufficient for the student to surpass the teacher. We explore a method that harnesses the knowledge of other students to complement the knowledge of the teacher. We propose deep collective knowledge distillation (DCKD), a model compression method that trains student models with rich information, acquiring knowledge not only from their teacher model but also from other student models. The knowledge collected from several student models contains a wealth of information about the correlation between classes. DCKD considers how to increase this class-correlation knowledge during training. Our method enables us to create better-performing student models for collecting knowledge. This simple yet powerful method achieves state-of-the-art performance in many experiments. For example, on ImageNet, ResNet18 trained with DCKD achieves 72.27%, which outperforms the pretrained ResNet18 by 2.52%. On CIFAR-100, a ShuffleNetV1 student trained with DCKD achieves 6.55% higher top-1 accuracy than the pretrained ShuffleNetV1.
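The abstract does not give the DCKD objective, so the following is only a minimal sketch of the general idea it describes: a student trained against hard labels, a teacher's soft predictions, and the soft predictions of other students. It combines the standard temperature-scaled distillation term with an averaged peer term. The function names, the weights `alpha` and `beta`, and the `temperature` value are all assumptions made for illustration, and the sketch does not model how DCKD increases class-correlation knowledge during training.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def collective_kd_loss(student_logits, teacher_logits, peer_logits_list, label,
                       temperature=4.0, alpha=1.0, beta=1.0):
    """Hypothetical loss: CE + alpha * KD(teacher) + beta * mean KD(peers).

    This is an illustrative combination, not the loss from the paper.
    """
    # Hard-label cross-entropy on the student's own prediction.
    probs = softmax(student_logits)
    ce = -math.log(probs[label] + 1e-12)

    # Softened student distribution shared by both distillation terms.
    s_soft = softmax(student_logits, temperature)

    # Distillation from the teacher (scaled by T^2, as is conventional).
    t_soft = softmax(teacher_logits, temperature)
    kd_teacher = kl_divergence(t_soft, s_soft) * temperature ** 2

    # Distillation from other students, averaged over the peers.
    kd_peers = 0.0
    if peer_logits_list:
        kd_peers = sum(
            kl_divergence(softmax(p, temperature), s_soft)
            for p in peer_logits_list
        ) / len(peer_logits_list) * temperature ** 2

    return ce + alpha * kd_teacher + beta * kd_peers


if __name__ == "__main__":
    student = [2.0, 1.0, 0.0]
    teacher = [3.0, 0.5, 0.0]
    peers = [[2.5, 1.5, 0.0], [1.5, 1.0, 0.5]]
    print(collective_kd_loss(student, teacher, peers, label=0))
```

In this sketch, when the teacher and every peer produce the same logits as the student, both distillation terms vanish and the loss reduces to plain cross-entropy. Setting `beta=0` gives ordinary teacher-only distillation.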
