This paper proposes a training data augmentation pipeline that combines synthetic image data with neural style transfer to address the vulnerability of deep vision models to common corruptions. We show that although applying style transfer to synthetic images degrades their quality as measured by the widely used Fréchet Inception Distance (FID), these images are surprisingly beneficial for model training. We conduct a systematic empirical analysis of the effects of both augmentations and their key hyperparameters on the performance of image classifiers. Our results demonstrate that stylization and synthetic data complement each other well and can be combined with popular rule-based data augmentation techniques such as TrivialAugment, although they are incompatible with certain others. Our method achieves state-of-the-art corruption robustness on several small-scale image classification benchmarks, reaching 93.54%, 74.9%, and 50.86% robust accuracy on CIFAR-10-C, CIFAR-100-C, and TinyImageNet-C, respectively.