High-performance computing (HPC) systems expose many interdependent configuration knobs that affect runtime, resource usage, power, and variability. Existing predictive tools model these outcomes but do not support structured exploration, explanation, or guided reconfiguration. We present WANDER, a decision-support framework that uses counterfactual analysis to synthesize alternative configurations aligned with user goals and constraints. We introduce a composite trade-off score that ranks suggestions by three criteria: prediction uncertainty, the consistency of feature-target relationships as assessed with causal models, and the similarity of feature distributions to historical data. To our knowledge, WANDER is the first system to unify prediction, exploration, and explanation for HPC tuning under a common query interface. Across multiple datasets, WANDER generates interpretable, trustworthy, human-readable alternatives that guide users toward their performance objectives.
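To make the idea of a composite trade-off score concrete, the following is a minimal sketch of how three such criteria could be combined into a single ranking. It is an illustration under stated assumptions, not WANDER's actual formulation: the abstract does not specify the functional form, so the weighted sum, the equal default weights, the normalization of every component to [0, 1], and all names (`Candidate`, `trade_off_score`, `rank_candidates`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """A hypothetical counterfactual configuration with pre-computed criteria.

    All three values are assumed to be normalized to [0, 1].
    """
    name: str
    uncertainty: float         # prediction uncertainty (lower is better)
    causal_consistency: float  # agreement with a causal model (higher is better)
    similarity: float          # closeness to historical data (higher is better)


def trade_off_score(
    c: Candidate,
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    """Combine the three criteria with an assumed weighted sum.

    Uncertainty is inverted so that a higher score is always better.
    """
    for value in (c.uncertainty, c.causal_consistency, c.similarity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("criteria must be normalized to [0, 1]")
    w_u, w_c, w_s = weights
    return (
        w_u * (1.0 - c.uncertainty)
        + w_c * c.causal_consistency
        + w_s * c.similarity
    )


def rank_candidates(
    candidates: Iterable[Candidate],
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> List[Candidate]:
    """Return candidates sorted from best to worst composite score."""
    return sorted(
        candidates,
        key=lambda c: trade_off_score(c, weights),
        reverse=True,
    )


if __name__ == "__main__":
    # Illustrative values only; these are not results from any dataset.
    options = [
        Candidate("fewer_nodes", 0.5, 0.5, 0.5),
        Candidate("larger_block", 0.1, 0.9, 0.8),
        Candidate("exotic_layout", 0.9, 0.2, 0.1),
    ]
    for c in rank_candidates(options):
        print(f"{c.name}: {trade_off_score(c):.3f}")
```

A weighted sum is only one possible aggregation. A product or a lexicographic ordering would penalize a single weak criterion more strongly, and the weights could be exposed to the user to reflect how much they value trust over novelty.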