「GitHub」知识蒸馏从入门到精通

2019 年 7 月 20 日 专知

【导读】知识蒸馏(Knowledge Distilling)是由大神Geoffrey Hinton、Oriol Vinyals、Jeff Dean在NIPS2015上提出的。作为模型压缩的一种方法,知识蒸馏能够利用已经训练的一个较复杂的模型,来指导一个较轻量的模型训练,从而在减小模型大小和计算资源的同时,尽量保持原始大模型的准确率的方法。随着越来越多的AI算法落地工业界,知识蒸馏在大量工业场景上发光发热。Github上的dkozlov同学,整理了Knowledge Distilling的paper、教程、代码,看完这些资料,你一定有所收获。


Github地址: 

https://github.com/dkozlov/awesome-knowledge-distillation

作者:

dkozlov


【文章列表】

  • Combining labeled and unlabeled data with co-training, A. Blum, T. Mitchell, 1998

  • Model Compression, Rich Caruana, 2006

  • Dark knowledge, Geoffrey Hinton , OriolVinyals & Jeff Dean, 2014

  • Learning with Pseudo-Ensembles, Philip Bachman, Ouais Alsharif, Doina Precup, 2014

  • Distilling the Knowledge in a Neural Network, Hinton, J.Dean, 2015

  • Cross Modal Distillation for Supervision Transfer, Saurabh Gupta, Judy Hoffman, Jitendra Malik, 2015

  • Heterogeneous Knowledge Transfer in Video Emotion Recognition, Attribution and Summarization, Baohan Xu, Yanwei Fu, Yu-Gang Jiang, Boyang Li, Leonid Sigal, 2015

  • Distilling Model Knowledge, George Papamakarios, 2015

  • Unifying distillation and privileged information, David Lopez-Paz, Léon Bottou, Bernhard Schölkopf, Vladimir Vapnik, 2015

  • Learning Using Privileged Information: Similarity Control and Knowledge Transfer, Vladimir Vapnik, Rauf Izmailov, 2015

  • Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks, Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, Ananthram Swami, 2016

  • Do deep convolutional nets really need to be deep and convolutional?, Gregor Urban, Krzysztof J. Geras, Samira Ebrahimi Kahou, Ozlem Aslan, Shengjie Wang, Rich Caruana, Abdelrahman Mohamed, Matthai Philipose, Matt Richardson, 2016

  • Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, Sergey Zagoruyko, Nikos Komodakis, 2016

  • FitNets: Hints for Thin Deep Nets, Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, Yoshua Bengio, 2015

  • Deep Model Compression: Distilling Knowledge from Noisy Teachers, Bharat Bhusan Sau, Vineeth N. Balasubramanian, 2016

  • Knowledge Distillation for Small-footprint Highway Networks, Liang Lu, Michelle Guo, Steve Renals, 2016

  • Sequence-Level Knowledge Distillation, deeplearning-papernotes, Yoon Kim, Alexander M. Rush, 2016

  • MobileID: Face Model Compression by Distilling Knowledge from Neurons, Ping Luo, Zhenyao Zhu, Ziwei Liu, Xiaogang Wang and Xiaoou Tang, 2016

  • Recurrent Neural Network Training with Dark Knowledge Transfer, Zhiyuan Tang, Dong Wang, Zhiyong Zhang, 2016

  • Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, Sergey Zagoruyko, Nikos Komodakis, 2016

  • Adapting Models to Signal Degradation using Distillation, Jong-Chyi Su, Subhransu Maji,2016

  • Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results, Antti Tarvainen, Harri Valpola, 2017

  • Data-Free Knowledge Distillation For Deep Neural Networks, Raphael Gontijo Lopes, Stefano Fenu, 2017

  • Like What You Like: Knowledge Distill via Neuron Selectivity Transfer, Zehao Huang, Naiyan Wang, 2017

  • Learning Loss for Knowledge Distillation with Conditional Adversarial Networks, Zheng Xu, Yen-Chang Hsu, Jiawei Huang, 2017

  • DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang, 2017

  • Knowledge Projection for Deep Neural Networks, Zhi Zhang, Guanghan Ning, Zhihai He, 2017

  • Moonshine: Distilling with Cheap Convolutions, Elliot J. Crowley, Gavin Gray, Amos Storkey, 2017

  • Local Affine Approximators for Improving Knowledge Transfer, Suraj Srinivas and Francois Fleuret, 2017

  • Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model, Jiasen Lu1, Anitha Kannan, Jianwei Yang, Devi Parikh, Dhruv Batra 2017

  • Learning Efficient Object Detection Models with Knowledge Distillation, Guobin Chen, Wongun Choi, Xiang Yu, Tony Han, Manmohan Chandraker, 2017

  • Model Distillation with Knowledge Transfer from Face Classification to Alignment and Verification, Chong Wang, Xipeng Lan and Yangang Zhang, 2017

  • Learning Transferable Architectures for Scalable Image Recognition, Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le, 2017

  • Revisiting knowledge transfer for training object class detectors, Jasper Uijlings, Stefan Popov, Vittorio Ferrari, 2017

  • A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning, Junho Yim, Donggyu Joo, Jihoon Bae, Junmo Kim, 2017

  • Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net, Zihao Liu, Qi Liu, Tao Liu, Yanzhi Wang, Wujie Wen, 2017

  • Data Distillation: Towards Omni-Supervised Learning, Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, Kaiming He, 2017

  • Interpreting Deep Classifiers by Visual Distillation of Dark Knowledge, Kai Xu, Dae Hoon Park, Chang Yi, Charles Sutton, 2018

  • Efficient Neural Architecture Search via Parameters Sharing, Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean, 2018

  • Transparent Model Distillation, Sarah Tan, Rich Caruana, Giles Hooker, Albert Gordo, 2018

  • Defensive Collaborative Multi-task Training - Defending against Adversarial Attack towards Deep Neural Networks, Derek Wang, Chaoran Li, Sheng Wen, Yang Xiang, Wanlei Zhou, Surya Nepal, 2018

  • Deep Co-Training for Semi-Supervised Image Recognition, Siyuan Qiao, Wei Shen, Zhishuai Zhang, Bo Wang, Alan Yuille, 2018

  • Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples, Zihao Liu, Qi Liu, Tao Liu, Yanzhi Wang, Wujie Wen, 2018

  • Multimodal Recurrent Neural Networks with Information Transfer Layers for Indoor Scene Labeling, Abrar H. Abdulnabi, Bing Shuai, Zhen Zuo, Lap-Pui Chau, Gang Wang, 2018

  • Born Again Neural Networks, Tommaso Furlanello, Zachary C. Lipton, Michael Tschannen, Laurent Itti, Anima Anandkumar, 2018

  • YASENN: Explaining Neural Networks via Partitioning Activation Sequences, Yaroslav Zharov, Denis Korzhenkov, Pavel Shvechikov, Alexander Tuzhilin, 2018

  • Knowledge Distillation with Adversarial Samples Supporting Decision Boundary, Byeongho Heo, Minsik Lee, Sangdoo Yun, Jin Young Choi, 2018

  • Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons, Byeongho Heo, Minsik Lee, Sangdoo Yun, Jin Young Choi, 2018

  • Self-supervised knowledge distillation using singular value decomposition, Seung Hyun Lee, Dae Ha Kim, Byung Cheol Song, 2018

  • Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection, Yongcheng Liu, Lu Sheng, Jing Shao, Junjie Yan, Shiming Xiang, Chunhong Pan, 2018

  • Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks, Yuenan Hou, Zheng Ma, Chunxiao Liu, Chen Change Loy, 2018

  • Deep Face Recognition Model Compression via Knowledge Transfer and Distillation, Jayashree Karlekar, Jiashi Feng, Zi Sian Wong, Sugiri Pranata, 2019

  • Relational Knowledge Distillation, Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho, 2019

  • Graph-based Knowledge Distillation by Multi-head Attention Network, Seunghyun Lee, Byung Cheol Song, 2019

  • Knowledge Adaptation for Efficient Semantic Segmentation, Tong He, Chunhua Shen, Zhi Tian, Dong Gong, Changming Sun, Youliang Yan, 2019

  • Structured Knowledge Distillation for Semantic Segmentation, Yifan Liu, Ke Chen, Chris Liu, Zengchang Qin, Zhenbo Luo, Jingdong Wang, 2019



【视频教程】

  • Dark knowledge, Geoffrey Hinton, 2014

  • Model Compression, Rich Caruana, 2016


【代码实现】


MXNet

  • Bayesian Dark Knowledge


PyTorch

  • Attention Transfer

  • Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model

  • Interpreting Deep Classifier by Visual Distillation of Dark Knowledge

  • A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility

  • Mean teachers are better role models

  • Neural Network Distiller by Intel AI Lab, distiller/knowledge_distillation.py

  • Relational Knowledge Distillation

  • Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons


Lua

  • Example for teacher/student-based learning


Torch

  • Distilling knowledge to specialist ConvNets for clustered classification

  • Sequence-Level Knowledge Distillation, Neural Machine Translation on Android

  • cifar.torch distillation


Theano

  • FitNets: Hints for Thin Deep Nets

  • Transfer knowledge from a large DNN or an ensemble of DNNs into a small DNN


Lasagne + Theano

  • Experiments-with-Distilling-Knowledge


Tensorflow

  • Deep Model Compression: Distilling Knowledge from Noisy Teachers

  • Distillation

  • An example application of neural network distillation to MNIST

  • Data-free Knowledge Distillation for Deep Neural Networks

  • Inspired by net2net, network distillation

  • Deep Reinforcement Learning, knowledge transfer

  • Knowledge Distillation using Tensorflow

  • Knowledge Distillation Methods with Tensorflow

  • Zero-Shot Knowledge Distillation in Deep Networks in ICML2019


Caffe

  • Face Model Compression by Distilling Knowledge from Neurons

  • KnowledgeDistillation Layer (Caffe implementation)

  • Knowledge distillation, realized in caffe

  • Cross Modal Distillation for Supervision Transfer

  • Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection


Keras

  • Knowledge distillation with Keras

  • keras google-vision's distillation

  • Distilling the knowledge in a Neural Network


更多内容,请访问该Github库。



-END-

专 · 知

专知,专业可信的人工智能知识分发,让认知协作更快更好!欢迎登录www.zhuanzhi.ai,注册登录专知,获取更多AI知识资料!

欢迎微信扫一扫加入专知人工智能知识星球群,获取最新AI专业干货知识教程视频资料和与专家交流咨询

请加专知小助手微信(扫一扫如下二维码添加),加入专知人工智能主题群,咨询技术商务合作~

专知《深度学习:算法到实战》课程全部完成!550+位同学在学习,现在报名,限时优惠!网易云课堂人工智能畅销榜首位!

点击“阅读原文”,了解报名专知《深度学习:算法到实战》课程

登录查看更多
3

相关内容

专知会员服务
109+阅读 · 2020年3月12日
深度强化学习策略梯度教程,53页ppt
专知会员服务
176+阅读 · 2020年2月1日
【深度学习视频分析/多模态学习资源大列表】
专知会员服务
91+阅读 · 2019年10月16日
强化学习最新教程,17页pdf
专知会员服务
168+阅读 · 2019年10月11日
机器学习入门的经验与建议
专知会员服务
90+阅读 · 2019年10月10日
MIT新书《强化学习与最优控制》
专知会员服务
270+阅读 · 2019年10月9日
【资源】领域自适应相关论文、代码分享
专知
30+阅读 · 2019年10月12日
手把手教你入门深度强化学习(附链接&代码)
THU数据派
75+阅读 · 2019年7月16日
Github项目推荐 | 图神经网络(GNN)相关资源大列表
从入门到精通-Tensorflow深度强化学习课程
深度学习与NLP
23+阅读 · 2019年3月7日
自然语言处理顶会EMNLP2018接受论文列表!
专知
87+阅读 · 2018年8月26日
论文荐读 | NLP之Attention从入门到精通
人工智能前沿讲习班
5+阅读 · 2018年5月14日
最前沿的深度学习论文、架构及资源分享
深度学习与NLP
13+阅读 · 2018年1月25日
干货 | 深度学习论文汇总
AI科技评论
4+阅读 · 2018年1月1日
【推荐】GAN架构入门综述(资源汇总)
机器学习研究会
10+阅读 · 2017年9月3日
Arxiv
5+阅读 · 2020年3月17日
Teacher-Student Training for Robust Tacotron-based TTS
Arxiv
53+阅读 · 2018年12月11日
Arxiv
5+阅读 · 2018年10月11日
VIP会员
相关VIP内容
专知会员服务
109+阅读 · 2020年3月12日
深度强化学习策略梯度教程,53页ppt
专知会员服务
176+阅读 · 2020年2月1日
【深度学习视频分析/多模态学习资源大列表】
专知会员服务
91+阅读 · 2019年10月16日
强化学习最新教程,17页pdf
专知会员服务
168+阅读 · 2019年10月11日
机器学习入门的经验与建议
专知会员服务
90+阅读 · 2019年10月10日
MIT新书《强化学习与最优控制》
专知会员服务
270+阅读 · 2019年10月9日
相关资讯
【资源】领域自适应相关论文、代码分享
专知
30+阅读 · 2019年10月12日
手把手教你入门深度强化学习(附链接&代码)
THU数据派
75+阅读 · 2019年7月16日
Github项目推荐 | 图神经网络(GNN)相关资源大列表
从入门到精通-Tensorflow深度强化学习课程
深度学习与NLP
23+阅读 · 2019年3月7日
自然语言处理顶会EMNLP2018接受论文列表!
专知
87+阅读 · 2018年8月26日
论文荐读 | NLP之Attention从入门到精通
人工智能前沿讲习班
5+阅读 · 2018年5月14日
最前沿的深度学习论文、架构及资源分享
深度学习与NLP
13+阅读 · 2018年1月25日
干货 | 深度学习论文汇总
AI科技评论
4+阅读 · 2018年1月1日
【推荐】GAN架构入门综述(资源汇总)
机器学习研究会
10+阅读 · 2017年9月3日
Top
微信扫码咨询专知VIP会员