Voice assistants use Keyword Spotting (KWS) to enable efficient, privacy-friendly activation. However, realizing accurate KWS models on ultra-low-power TinyML devices (often with less than 2 MB of flash memory) requires a delicate balance between accuracy and strict resource constraints. Multi-objective Bayesian Optimization (MOBO) is an ideal candidate for managing such a trade-off but is highly sensitive to initialization, especially in the budgeted black-box setting. Existing methods typically fall back on naive, ad hoc sampling routines (e.g., Latin Hypercube Sampling (LHS), Sobol sequences, or random search) that are neither adapted to the Pareto front nor subjected to rigorous statistical comparison. To address this, we propose Objective-Aware Surrogate Initialization (OASI), a novel initialization strategy that leverages Multi-Objective Simulated Annealing (MOSA) to generate a seed Pareto set of high-performing and diverse configurations that explicitly balance accuracy and model size. Evaluated in a TinyML KWS setting, OASI outperforms LHS, Sobol, and random initialization, achieving the highest hypervolume (0.0627) and the lowest generational distance (0.0) across multiple runs, with only a modest increase in computation time (1934 s vs. $\sim$1500 s). A non-parametric statistical analysis using the Kruskal-Wallis test ($H = 5.40$, $p = 0.144$, $\eta^2 = 0.0007$) and Dunn's post-hoc test confirms OASI's superior consistency, although the overall difference is not significant at the $\alpha = 0.05$ threshold.
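The two quality indicators reported above, hypervolume and generational distance, can be illustrated with a minimal sketch. It assumes a two-objective minimization problem (for instance, classification error and normalized model size), a user-chosen reference point, and the mean-distance variant of generational distance. The function names and the sample points are hypothetical and are not taken from the paper.

```python
import math


def pareto_front(points):
    """Return the non-dominated points, assuming both objectives are minimized."""
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    # Sorting by the first objective makes the second objective strictly decreasing.
    return sorted(set(front))


def hypervolume_2d(points, ref):
    """Area dominated by the front and bounded by the reference point `ref`."""
    front = [p for p in pareto_front(points) if p[0] < ref[0] and p[1] < ref[1]]
    area = 0.0
    prev_f2 = ref[1]
    for f1, f2 in front:
        # Add the horizontal slab between the previous and the current f2 level.
        area += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return area


def generational_distance(approx_front, reference_front):
    """Mean Euclidean distance from each approximate point to its nearest reference point."""
    total = 0.0
    for p in approx_front:
        total += min(math.dist(p, r) for r in reference_front)
    return total / len(approx_front)


# Illustrative values only: (error, normalized size) pairs.
candidates = [(0.1, 0.8), (0.3, 0.4), (0.6, 0.2), (0.5, 0.5)]
front = pareto_front(candidates)
hv = hypervolume_2d(candidates, ref=(1.0, 1.0))
gd = generational_distance(front, front)
```

A larger hypervolume means the front dominates more of the objective space below the reference point. A generational distance of 0.0 means every point of the obtained front lies on the reference front, which is how the reported value of 0.0 would be read under this definition.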