HALF: Holistic Auto Machine Learning for FPGAs

From arXiv. Submitted to FPL 2021. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Deep Neural Networks (DNNs) are capable of solving complex problems in domains related to embedded systems, such as image and natural language processing. To efficiently implement DNNs on a specific FPGA platform for a given cost criterion, e.g., energy efficiency, an enormous number of design parameters has to be considered, from the topology down to the final hardware implementation. Interdependencies between the different design layers have to be taken into account and explored efficiently, which makes it nearly impossible to find optimized solutions manually. An automatic, holistic design approach can significantly improve the quality of DNN implementations on FPGAs. To this end, we present a cross-layer design space exploration methodology. It comprises optimizations starting from a hardware-aware topology search for DNNs down to the final optimized implementation for a given FPGA platform. The methodology is implemented in our Holistic Auto Machine Learning for FPGAs (HALF) framework, which combines an evolutionary search algorithm, various optimization steps, and a library of parametrizable hardware DNN modules. HALF automates both the exploration process and the implementation of optimized solutions on a target FPGA platform for various applications. We demonstrate the performance of HALF on a medical use case, arrhythmia detection, for three different design goals: low energy, low power, and high throughput. Our FPGA implementation outperforms a TensorRT-optimized model on an Nvidia Jetson platform in both throughput and energy consumption.
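To make the idea of an evolutionary, hardware-aware design space search more concrete, here is a minimal Python sketch. It is not the HALF implementation: the search space, the parameter names (`num_layers`, `channels`, `bit_width`, `parallelism`), and the `toy_cost` function are all assumptions invented for illustration. A real flow would replace `toy_cost` with a trained model's accuracy and measured or estimated FPGA metrics.

```python
import random

# Hypothetical joint search space: topology parameters (num_layers, channels)
# together with hardware parameters (bit_width, parallelism).
# All names and values are illustrative assumptions, not taken from HALF.
SEARCH_SPACE = {
    "num_layers": [2, 4, 6, 8],
    "channels": [8, 16, 32, 64],
    "bit_width": [2, 4, 8],
    "parallelism": [1, 2, 4, 8],
}


def random_candidate(rng):
    """Draw one design point uniformly at random from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def mutate(candidate, rng):
    """Copy a candidate and resample exactly one of its parameters."""
    child = dict(candidate)
    name = rng.choice(sorted(SEARCH_SPACE))
    child[name] = rng.choice(SEARCH_SPACE[name])
    return child


def toy_cost(c):
    """Made-up stand-in for a hardware-aware cost (lower is better)."""
    work = c["num_layers"] * c["channels"] ** 2 * c["bit_width"]
    latency = work / c["parallelism"]
    power = 1.0 + 0.5 * c["parallelism"]  # static plus dynamic part (toy model)
    energy = latency * power
    # Crude accuracy proxy: very small models are penalized.
    capacity = c["num_layers"] * c["channels"] * c["bit_width"]
    accuracy_penalty = 1e6 / capacity
    return energy + accuracy_penalty


def evolutionary_search(cost_fn, generations=30, population_size=16, seed=0):
    """Simple elitist evolutionary loop: keep the best half, mutate to refill."""
    rng = random.Random(seed)
    population = [random_candidate(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=cost_fn)
        parents = population[: population_size // 2]
        children = [
            mutate(rng.choice(parents), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return min(population, key=cost_fn)


if __name__ == "__main__":
    best = evolutionary_search(toy_cost)
    print("best design point:", best, "cost:", toy_cost(best))
```

Because the best half of each generation is carried over unchanged, the best cost never gets worse from one generation to the next. Swapping the cost function, for example to weight power or throughput instead of energy, would be one way to target different design goals in a sketch like this.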
