ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients

Neural Architecture Search (NAS) is widely used to automatically obtain the neural network with the best performance among a large number of candidate architectures. To reduce the search time, zero-shot NAS aims at designing training-free proxies that can predict the test performance of a given architecture. However, as shown recently, none of the zero-shot proxies proposed to date can actually work consistently better than a naive proxy, namely, the number of network parameters (#Params). To improve this state of affairs, as the main theoretical contribution, we first reveal how some specific gradient properties across different samples impact the convergence rate and generalization capacity of neural networks. Based on this theoretical analysis, we propose a new zero-shot proxy, ZiCo, the first proxy that works consistently better than #Params. We demonstrate that ZiCo works better than State-Of-The-Art (SOTA) proxies on several popular NAS-Benchmarks (NASBench101, NATSBench-SSS/TSS, TransNASBench-101) for multiple applications (e.g., image classification/reconstruction and pixel-level prediction). Finally, we demonstrate that the optimal architectures found via ZiCo are as competitive as the ones found by one-shot and multi-shot NAS methods, but with much less search time. For example, ZiCo-based NAS can find optimal architectures with 78.1%, 79.4%, and 80.4% test accuracy under inference budgets of 450M, 600M, and 1000M FLOPs, respectively, on ImageNet within 0.4 GPU days. Our code is available at https://github.com/SLDGroup/ZiCo.
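As the title suggests, the proxy is built on the inverse coefficient of variation (mean divided by standard deviation) of gradients measured across different samples. The sketch below illustrates that statistic with NumPy only. It is not the authors' implementation: the function name `inverse_cv_score`, the use of absolute gradient values, the per-layer log-sum aggregation, and the `eps` threshold are all assumptions made for illustration. The linked repository is the reference for the exact definition.

```python
import numpy as np


def inverse_cv_score(layer_grads, eps=1e-12):
    """Illustrative inverse-coefficient-of-variation score (hypothetical helper).

    layer_grads: list with one array per layer, each of shape
                 (num_batches, num_params_in_layer), holding the gradient of
                 the loss w.r.t. that layer's parameters for each batch.
    """
    score = 0.0
    for grads in layer_grads:
        abs_grads = np.abs(np.asarray(grads, dtype=np.float64))
        mean = abs_grads.mean(axis=0)  # per-parameter mean across batches
        std = abs_grads.std(axis=0)    # per-parameter std across batches
        keep = std > eps               # skip parameters whose gradient never varies
        ratio_sum = np.sum(mean[keep] / std[keep])
        if ratio_sum > 0:
            # Assumed aggregation: log of the per-layer sum, added over layers.
            score += np.log(ratio_sum)
    return score


# Two toy "networks" with one layer of two parameters, observed on two batches.
# Both have the same mean absolute gradient; only the variation differs.
steady = [np.array([[1.0, 2.0], [1.1, 2.1]])]
noisy = [np.array([[0.1, 0.2], [2.0, 3.9]])]

print("steady gradients:", inverse_cv_score(steady))
print("noisy gradients: ", inverse_cv_score(noisy))
```

Under these assumptions, an architecture whose gradients are large on average and consistent across batches receives a higher score than one with the same average gradient but high variation. In practice the per-batch gradients would come from a few backward passes of the untrained candidate network.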
