The performance of an end-to-end neural network on a given hardware platform is a function of its compute and memory signature, which in turn is governed by a wide range of parameters such as topology size, primitives used, framework used, batching strategy, latency requirements, and precision. Current benchmarking tools suffer from one or more limitations: they are (a) too granular, like DeepBench, (b) dependent on a working implementation that is either framework-specific or hardware-architecture-specific, or (c) limited to high-level benchmark metrics. In this paper, we present NTP (Neural Net Topology Profiler), a sophisticated benchmarking framework that identifies the memory and compute signature of an end-to-end topology on multiple hardware architectures without the need to implement the topology in a framework. NTP is tightly integrated with hardware-specific benchmark tools to enable exhaustive data collection and analysis. Using NTP, a deep learning researcher can quickly establish the baselines needed to understand the performance of an end-to-end neural network topology and make high-level architectural decisions based on optimization techniques such as layer sizing, quantization, and pruning. Further, integration of NTP with frameworks such as TensorFlow, PyTorch, and Intel OpenVINO allows performance comparison along several vectors, including (a) different frameworks on a given hardware platform, (b) different hardware platforms using a given framework, and (c) different heterogeneous hardware configurations for a given framework. These capabilities empower a researcher to effortlessly make the architectural decisions needed to achieve optimized performance on any hardware platform. The paper documents the architectural approach of NTP and demonstrates the capabilities of the tool by benchmarking Mozilla DeepSpeech, a popular speech recognition topology.
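To make the notion of a compute and memory signature concrete, the following is a minimal sketch of how such a signature could be estimated from a layer-by-layer description of a topology, without building the network in any framework. It is not NTP's implementation: the names `DenseLayer`, `layer_signature`, and `profile`, the restriction to fully connected layers, and the layer sizes are all assumptions made for illustration, and the sizes are placeholders rather than the actual DeepSpeech configuration.

```python
from dataclasses import dataclass


@dataclass
class DenseLayer:
    """Hypothetical description of a fully connected layer (illustration only)."""
    name: str
    in_features: int
    out_features: int


def layer_signature(layer, batch, bytes_per_elem):
    """Estimate (flops, bytes_moved) for one dense layer.

    Compute: one multiply and one add per weight per sample (bias add ignored).
    Memory: weights and biases read once, plus input and output activations.
    """
    flops = 2 * batch * layer.in_features * layer.out_features
    weight_elems = layer.in_features * layer.out_features + layer.out_features
    activation_elems = batch * (layer.in_features + layer.out_features)
    bytes_moved = (weight_elems + activation_elems) * bytes_per_elem
    return flops, bytes_moved


def profile(layers, batch=1, bytes_per_elem=4):
    """Return a per-layer signature table as a list of dicts.

    bytes_per_elem models precision: 4 for FP32, 1 for INT8, and so on.
    """
    rows = []
    for layer in layers:
        flops, nbytes = layer_signature(layer, batch, bytes_per_elem)
        rows.append({
            "name": layer.name,
            "flops": flops,
            "bytes": nbytes,
            # Arithmetic intensity (FLOPs per byte) hints at whether a layer
            # is likely compute-bound or memory-bound on a given platform.
            "intensity": flops / nbytes,
        })
    return rows


if __name__ == "__main__":
    # Placeholder sizes, NOT the real DeepSpeech layer dimensions.
    topology = [
        DenseLayer("fc1", 512, 1024),
        DenseLayer("fc2", 1024, 1024),
        DenseLayer("out", 1024, 32),
    ]
    for precision_bytes in (4, 1):
        for row in profile(topology, batch=8, bytes_per_elem=precision_bytes):
            print(precision_bytes, row)
```

Varying `batch` and `bytes_per_elem` in this sketch shows how batching strategy and precision shift the ratio of compute to memory traffic, which is the kind of trade-off the abstract describes; a real profiler would also need to cover recurrent and convolutional primitives and measure actual hardware behavior.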