Recent advancements in large language models (LLMs) have enabled significant progress in decision-making and task planning for embodied autonomous agents. However, most existing methods still struggle with complex, long-horizon tasks because they rely on a monolithic trajectory that entangles all past decisions and observations, attempting to solve the entire task in a single unified process. To address this limitation, we propose ReAcTree, a hierarchical task-planning method that decomposes a complex goal into more manageable subgoals within a dynamically constructed agent tree. Each subgoal is handled by an LLM agent node capable of reasoning, acting, and further expanding the tree, while control flow nodes coordinate the execution strategies of agent nodes. In addition, we integrate two complementary memory systems: each agent node retrieves goal-specific, subgoal-level examples from episodic memory and shares environment-specific observations through working memory. Experiments on the WAH-NL and ALFRED datasets demonstrate that ReAcTree consistently outperforms strong task-planning baselines such as ReAct across diverse LLMs. Notably, on WAH-NL, ReAcTree achieves a 61% goal success rate with Qwen 2.5 72B, nearly doubling ReAct's 31%.
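To make the architecture described above concrete, the following is a minimal Python sketch of the concepts named in the abstract: an agent tree whose LLM agent nodes handle subgoals, control flow nodes that coordinate their execution, and the two memory systems. All class names, fields, and the sequence/fallback strategies are illustrative assumptions for exposition, not the paper's actual implementation or API.

```python
# Illustrative sketch only; names and behaviors are assumptions, not the ReAcTree API.
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Dict, List, Union

Node = Union["AgentNode", "ControlFlowNode"]

@dataclass
class WorkingMemory:
    """Environment-specific observations shared among agent nodes."""
    observations: Dict[str, str] = field(default_factory=dict)

@dataclass
class EpisodicMemory:
    """Goal-specific, subgoal-level examples retrieved by each agent node."""
    examples: List[str] = field(default_factory=list)

    def retrieve(self, subgoal: str, k: int = 2) -> List[str]:
        # Placeholder: a real system would use similarity search over stored episodes.
        return self.examples[:k]

@dataclass
class AgentNode:
    """An LLM agent handling one subgoal; it may reason, act, or expand the tree."""
    subgoal: str
    children: List[Node] = field(default_factory=list)

    def execute(self, episodic: EpisodicMemory, working: WorkingMemory) -> bool:
        examples = episodic.retrieve(self.subgoal)  # in-context, subgoal-level examples
        # ... here an LLM would be prompted with the subgoal, the retrieved examples,
        # and shared observations from working memory, then act or expand children.
        return all(child.execute(episodic, working) for child in self.children)

@dataclass
class ControlFlowNode:
    """Coordinates how child nodes are executed (e.g., in sequence or as fallbacks)."""
    strategy: str  # assumed: "sequence" = all children must succeed; "fallback" = first success wins
    children: List[Node] = field(default_factory=list)

    def execute(self, episodic: EpisodicMemory, working: WorkingMemory) -> bool:
        results = [child.execute(episodic, working) for child in self.children]
        return all(results) if self.strategy == "sequence" else any(results)
```

Under these assumptions, a complex goal would correspond to a root agent node that expands into control flow nodes and child agent nodes as planning proceeds, with success propagated back up the tree.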