High-Performance Large-Scale Image Recognition Without Normalization

Model evaluation · Normalization · MoDELS · Gradient clipping · ImageNet (dataset)

February 11, 2021

Andrew Brock, Soham De, Samuel L. Smith, Karen Simonyan

Batch normalization is a key component of most image classification models, but it has many undesirable properties stemming from its dependence on the batch size and interactions between examples. Although recent work has succeeded in training deep ResNets without normalization layers, these models do not match the test accuracies of the best batch-normalized networks, and are often unstable for large learning rates or strong data augmentations. In this work, we develop an adaptive gradient clipping technique which overcomes these instabilities, and design a significantly improved class of Normalizer-Free ResNets. Our smaller models match the test accuracy of an EfficientNet-B7 on ImageNet while being up to 8.7x faster to train, and our largest models attain a new state-of-the-art top-1 accuracy of 86.5%. In addition, Normalizer-Free models attain significantly better performance than their batch-normalized counterparts when finetuning on ImageNet after large-scale pre-training on a dataset of 300 million labeled images, with our best models obtaining an accuracy of 89.2%. Our code is available at https://github.com/deepmind/deepmind-research/tree/master/nfnets
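The abstract names adaptive gradient clipping but does not spell out the rule. The NumPy sketch below is one plausible reading, not the authors' implementation: for each output unit, the gradient is rescaled whenever the ratio of its norm to the norm of the corresponding weights exceeds a threshold. The function names, the assumption that axis 0 indexes output units, and the values of `clip` and `eps` are all illustrative assumptions.

```python
import numpy as np


def unitwise_norm(x: np.ndarray) -> np.ndarray:
    """L2 norm per output unit, kept broadcastable against x."""
    if x.ndim <= 1:
        # Biases and scalars: treat the whole array as a single unit.
        return np.sqrt(np.sum(x ** 2, keepdims=True))
    # Assumption: axis 0 indexes output units; reduce over all other axes.
    axes = tuple(range(1, x.ndim))
    return np.sqrt(np.sum(x ** 2, axis=axes, keepdims=True))


def adaptive_grad_clip(grad, weight, clip=0.01, eps=1e-3):
    """Rescale each unit's gradient so that ||g|| <= clip * max(||w||, eps)."""
    w_norm = np.maximum(unitwise_norm(weight), eps)  # eps protects zero-initialized weights
    g_norm = unitwise_norm(grad)
    max_norm = clip * w_norm
    # Guard against division by zero for all-zero gradients.
    scale = max_norm / np.maximum(g_norm, 1e-6)
    return np.where(g_norm > max_norm, grad * scale, grad)


# Illustrative values: each weight row has norm 2, so with clip=0.1
# the largest allowed gradient norm per row is 0.2.
w = np.ones((2, 4))
g = np.array([[3.0, 0.0, 0.0, 0.0],
              [0.001, 0.0, 0.0, 0.0]])
clipped = adaptive_grad_clip(g, w, clip=0.1)
```

In this example the first row is scaled down to norm 0.2, while the second row is already below the threshold and passes through unchanged. Tying the threshold to the weight norm, rather than using a fixed global value, is what makes the clipping adaptive in this sketch.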
