Agentic Workflows (AWs) have emerged as a promising paradigm for solving complex tasks. However, the scalability of automating their generation is severely constrained by the high cost and latency of execution-based evaluation. Existing AW performance prediction methods act as surrogates but fail to simultaneously capture the intricate topological dependencies and the deep semantic logic embedded in AWs. To address this limitation, we propose GLOW, a unified framework for AW performance prediction that combines the graph-structure modeling capabilities of GNNs with the reasoning power of LLMs. Specifically, we introduce a graph-oriented LLM, instruction-tuned on graph tasks, to extract topologically aware semantic features, which are fused with GNN-encoded structural representations. A contrastive alignment strategy further refines the latent space to distinguish high-quality AWs. Extensive experiments on FLORA-Bench show that GLOW outperforms state-of-the-art baselines in prediction accuracy and ranking utility.
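As a rough illustration of the pipeline described above, the following sketch fuses a structural graph embedding with a semantic feature vector, scores the result, and computes a contrastive loss over a batch of workflows. It is not GLOW's implementation. All function names are hypothetical, one round of mean aggregation stands in for the GNN, the `semantic` vector stands in for features from the graph-oriented LLM, concatenation is assumed as the fusion step, and the supervised contrastive loss is one common formulation chosen because the abstract does not specify the exact objective.

```python
import math


def mean_aggregate(node_feats, edges):
    """One round of message passing: each node averages itself and its predecessors."""
    preds = {i: [i] for i in range(len(node_feats))}  # start with a self-loop
    for src, dst in edges:
        preds[dst].append(src)
    dim = len(node_feats[0])
    return [
        [sum(node_feats[j][d] for j in preds[i]) / len(preds[i]) for d in range(dim)]
        for i in range(len(node_feats))
    ]


def mean_pool(node_feats):
    """Collapse per-node features into a single graph-level vector."""
    dim = len(node_feats[0])
    return [sum(row[d] for row in node_feats) / len(node_feats) for d in range(dim)]


def fuse(structural, semantic):
    """Assumed fusion step: plain concatenation of the two views."""
    return list(structural) + list(semantic)


def predict(fused, weights, bias):
    """Logistic head mapping the fused vector to a success probability."""
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def cosine(a, b):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def contrastive_loss(embeddings, labels, temperature=0.5):
    """Supervised contrastive loss: workflows sharing a label are treated as positives."""
    total, count = 0.0, 0
    for i, anchor in enumerate(embeddings):
        others = [j for j in range(len(embeddings)) if j != i]
        denom = sum(math.exp(cosine(anchor, embeddings[j]) / temperature) for j in others)
        for j in others:
            if labels[j] == labels[i]:
                numer = math.exp(cosine(anchor, embeddings[j]) / temperature)
                total += -math.log(numer / denom)
                count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    # Toy workflow with two agents, where agent 0 feeds agent 1.
    nodes = [[1.0, 0.0], [0.0, 1.0]]
    edges = [(0, 1)]
    structural = mean_pool(mean_aggregate(nodes, edges))
    semantic = [0.3, -0.2]  # placeholder for LLM-derived features
    fused = fuse(structural, semantic)
    print(predict(fused, weights=[0.5, -0.5, 1.0, 1.0], bias=0.0))
```

In this sketch the loss is lower when workflows with the same outcome label sit close together in the embedding space, which is the property a contrastive alignment step is meant to encourage.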