Sequential Multi-Agent Dynamic Algorithm Configuration

Dynamic algorithm configuration (DAC) is a recent trend in automated machine learning: it adjusts an algorithm's configuration dynamically during execution and relieves users of tedious trial-and-error tuning. Recently, multi-agent reinforcement learning (MARL) approaches have improved the configuration of multiple heterogeneous hyperparameters, making it feasible to configure the many parameters of complex algorithms. However, many complex algorithms have inherent inter-dependencies among their parameters (e.g., the operator type is determined first and then the operator's parameters), which previous approaches do not consider, leading to sub-optimal results. In this paper, we propose the sequential multi-agent DAC (Seq-MADAC) framework, which addresses this issue by taking these inter-dependencies into account. Specifically, we propose a sequential advantage decomposition network that leverages action-order information. Experiments ranging from synthetic functions to the configuration of multi-objective optimization algorithms demonstrate Seq-MADAC's superior performance over state-of-the-art MARL methods and its strong generalization across problem classes. Seq-MADAC establishes a new paradigm for dependency-aware automated algorithm configuration, a setting that arises widely. Our code is available at https://github.com/lamda-bbo/seq-madac.
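To make the operator-type-then-parameter dependency concrete, the following is a minimal toy sketch and not the Seq-MADAC network itself. It assumes two agents, random tables standing in for learned per-agent advantages, and a joint advantage written as a sum of sequentially conditioned terms. All names (`adv_type`, `adv_param`, `select_sequentially`, `joint_advantage`) are hypothetical and do not come from the paper or its repository.

```python
import numpy as np

# Hypothetical toy setup: two agents configure one step of an algorithm.
# Agent 0 picks an operator type; agent 1 picks that operator's parameter.
rng = np.random.default_rng(0)
N_TYPES, N_PARAMS = 3, 4

# Assumed advantage tables (random stand-ins for learned networks).
adv_type = rng.normal(size=N_TYPES)               # A_0(a_0)
adv_param = rng.normal(size=(N_TYPES, N_PARAMS))  # A_1(a_1 | a_0)


def select_sequentially(adv_type, adv_param):
    """Pick actions in order; the second agent conditions on the first."""
    a0 = int(np.argmax(adv_type))       # operator type chosen first
    a1 = int(np.argmax(adv_param[a0]))  # parameter chosen given that type
    return a0, a1


def joint_advantage(adv_type, adv_param, a0, a1):
    """Joint advantage as a sum of sequentially conditioned terms."""
    return float(adv_type[a0] + adv_param[a0, a1])


a0, a1 = select_sequentially(adv_type, adv_param)
total = joint_advantage(adv_type, adv_param, a0, a1)
print(f"type={a0}, param={a1}, joint advantage={total:.3f}")
```

In this toy, choosing greedily in order need not maximize the sum over all (type, parameter) pairs; it only illustrates how action order lets a later choice depend on an earlier one.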
