Dynamic algorithm configuration (DAC) is a recent trend in automated machine learning that dynamically adjusts an algorithm's configuration during execution, relieving users of tedious trial-and-error tuning. Recently, multi-agent reinforcement learning (MARL) approaches have improved the configuration of multiple heterogeneous hyperparameters, making it possible to configure the many parameters of complex algorithms. However, many complex algorithms have inherent inter-dependencies among their parameters (e.g., the operator type must be determined before the operator's parameter). Previous approaches do not account for these dependencies, which leads to sub-optimal results. In this paper, we propose the sequential multi-agent DAC (Seq-MADAC) framework, which addresses this issue by explicitly considering the inherent inter-dependencies among multiple parameters. Specifically, we propose a sequential advantage decomposition network, which leverages action-order information through sequential advantage decomposition. Experiments ranging from synthetic functions to the configuration of multi-objective optimization algorithms demonstrate that Seq-MADAC outperforms state-of-the-art MARL methods and generalizes well across problem classes. Seq-MADAC establishes a new paradigm for the widely occurring problem of dependency-aware automated algorithm configuration. Our code is available at https://github.com/lamda-bbo/seq-madac.
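To make the idea of a sequential advantage decomposition concrete, the following is a minimal sketch on a toy problem. It is not the Seq-MADAC network and is not taken from the linked repository. It assumes two agents acting in a fixed order (operator type first, then the operator's parameter), a hypothetical random table of joint action values, and a uniform policy. All names (`OPERATORS`, `PARAMS`, `make_toy_q`, `sequential_advantages`) are illustrative assumptions. The sketch only shows that the joint advantage can be split into one term per decision, where the second term is conditioned on the first agent's choice.

```python
import random

# Hypothetical toy configuration space (an assumption, not from the paper):
# agent 1 chooses the operator type, agent 2 chooses that operator's parameter.
OPERATORS = ["crossover", "mutation"]
PARAMS = [0.1, 0.5, 0.9]


def make_toy_q(seed=0):
    """Random joint action values Q(s, a1, a2) for one fixed state s."""
    rng = random.Random(seed)
    return {(op, p): rng.uniform(-1.0, 1.0) for op in OPERATORS for p in PARAMS}


def state_value(q):
    """V(s) under a uniform joint policy (assumed here for simplicity)."""
    return sum(q.values()) / len(q)


def operator_value(q, op):
    """Q(s, a1): value of committing to an operator, averaged over its parameter."""
    return sum(q[(op, p)] for p in PARAMS) / len(PARAMS)


def sequential_advantages(q, op, p):
    """Split the joint advantage into one term per decision, in action order."""
    # A1(s, a1): how much better this operator is than the average joint action.
    adv_operator = operator_value(q, op) - state_value(q)
    # A2(s, a1, a2): how much better this parameter is, given the chosen operator.
    adv_param = q[(op, p)] - operator_value(q, op)
    return adv_operator, adv_param


def joint_advantage(q, op, p):
    """A(s, a1, a2) = Q(s, a1, a2) - V(s)."""
    return q[(op, p)] - state_value(q)


if __name__ == "__main__":
    q = make_toy_q()
    for op, p in q:
        a1, a2 = sequential_advantages(q, op, p)
        print(f"{op:9s} p={p:.1f}  A1={a1:+.3f}  A2={a2:+.3f}  "
              f"A1+A2={a1 + a2:+.3f}  joint={joint_advantage(q, op, p):+.3f}")
```

Because the two terms telescope, their sum equals the joint advantage for every (operator, parameter) pair. The second term depends on the operator already chosen, which is the kind of action-order information that a decomposition ignoring the order cannot express.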