Privacy-preserving machine learning (PPML) is an emerging field concerned with secure machine learning inference over sensitive data in untrusted environments. Fully homomorphic encryption (FHE) enables computation directly on encrypted data on the server side, making it a promising approach for PPML. However, FHE imposes significant communication and computation overhead on the client side, which makes it impractical for edge devices. Hybrid homomorphic encryption (HHE) addresses this limitation by combining symmetric encryption (SE) with FHE to reduce the client-side computational cost. Pairing FHE with an FHE-friendly SE scheme can also reduce the processing overhead on the server side, making HHE a more balanced and efficient alternative. We propose a hardware-accelerated HHE architecture built around a lightweight symmetric cipher that is optimized for FHE compatibility and implemented as a dedicated hardware accelerator. To the best of our knowledge, this is the first design to integrate an end-to-end HHE framework with hardware acceleration. We also present several microarchitectural optimizations that improve performance and energy efficiency. The proposed design is integrated into a full PPML pipeline, enabling secure inference with significantly lower latency and power consumption than software implementations. Our contributions validate the feasibility of low-power, hardware-accelerated HHE for edge deployment and provide a hardware-software co-design methodology for building scalable, secure machine learning systems in resource-constrained environments. Experiments on a PYNQ-Z2 platform with the MNIST dataset show a reduction of more than 50x in client-side encryption latency and a gain of nearly 2x in hardware throughput compared to existing FPGA-based HHE accelerators.
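To make the HHE data flow concrete, the following is a minimal Python sketch under loudly stated assumptions; it is not the cipher or the accelerator proposed in this work. The class `MockCt` is a hypothetical stand-in for an FHE ciphertext that stores values in the clear and offers no security, and the "symmetric cipher" is a toy affine mask chosen only because it is trivial to evaluate homomorphically. The modulus `T` and all function names are illustrative. The sketch shows the three HHE steps: the client masks its data with a cheap symmetric keystream, the server regenerates that keystream under (mock) FHE from the encrypted key and strips it off, and inference then runs on the resulting FHE ciphertexts.

```python
import hashlib
import secrets

T = 65537  # toy plaintext modulus (assumption, not taken from the paper)


class MockCt:
    """Stand-in for an FHE ciphertext. It keeps the value in the clear and
    provides NO security; it only mimics an add/multiply interface."""

    def __init__(self, value):
        self._v = value % T

    def __add__(self, other):
        return MockCt(self._v + _val(other))

    def __rsub__(self, other):  # handles plaintext - ciphertext
        return MockCt(_val(other) - self._v)

    def __mul__(self, other):
        return MockCt(self._v * _val(other))


def _val(x):
    """Return the underlying integer of a MockCt or a plain int."""
    return x._v if isinstance(x, MockCt) else x


def fhe_encrypt(x):
    return MockCt(x)


def fhe_decrypt(ct):
    return ct._v


def public_coeffs(nonce, i):
    """Derive public per-position constants (a_i, b_i) from the nonce."""
    d = hashlib.sha256(nonce + i.to_bytes(4, "big")).digest()
    return int.from_bytes(d[:4], "big") % T, int.from_bytes(d[4:8], "big") % T


def client_encrypt(key, nonce, msg):
    """Client side: cheap symmetric masking, c_i = m_i + key*a_i + b_i mod T.
    This affine keystream is a toy and is NOT a secure cipher."""
    out = []
    for i, m in enumerate(msg):
        a, b = public_coeffs(nonce, i)
        out.append((m + key * a + b) % T)
    return out


def server_transcipher(enc_key, nonce, sym_ct):
    """Server side: rebuild the keystream under (mock) FHE from the encrypted
    key, then subtract it to obtain FHE ciphertexts of the message."""
    out = []
    for i, c in enumerate(sym_ct):
        a, b = public_coeffs(nonce, i)
        enc_keystream = enc_key * a + b
        out.append(c - enc_keystream)
    return out


def server_linear_layer(enc_x, weights):
    """Encrypted dot product, standing in for one step of ML inference."""
    acc = fhe_encrypt(0)
    for x, w in zip(enc_x, weights):
        acc = acc + x * w
    return acc


def demo():
    key = secrets.randbelow(T)
    nonce = secrets.token_bytes(8)
    pixels = [0, 17, 255, 128]   # e.g. four image pixel values
    weights = [3, 1, 2, 5]       # plaintext model weights held by the server
    sym_ct = client_encrypt(key, nonce, pixels)   # sent per input
    enc_key = fhe_encrypt(key)                    # sent once
    enc_x = server_transcipher(enc_key, nonce, sym_ct)
    return fhe_decrypt(server_linear_layer(enc_x, weights))
```

Calling `demo()` returns the plaintext dot product of `pixels` and `weights`, which illustrates why the client only needs the symmetric step per input: the expensive homomorphic work, including removal of the keystream, happens on the server.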