Learning Where, What and How to Transfer: A Multi-Role Reinforcement Learning Approach for Evolutionary Multitasking

Evolutionary multitasking (EMT) algorithms typically require tailored designs for knowledge transfer to ensure convergence and optimality in multitask optimization. In this paper, we explore designing a systematic and generalizable knowledge transfer policy through reinforcement learning (RL). We first identify three major challenges: determining which tasks to transfer between (where), which knowledge to transfer (what), and which mechanism to use for the transfer (how). To address these challenges, we formulate a multi-role RL system in which three (groups of) policy networks act as specialized agents: a task routing agent incorporates an attention-based similarity recognition module to determine source-target transfer pairs via attention scores; a knowledge control agent determines the proportion of elite solutions to transfer; and a group of strategy adaptation agents control transfer strength by dynamically adjusting hyper-parameters in the underlying EMT framework. By pre-training all network modules end-to-end over an augmented multitask problem distribution, we obtain a generalizable meta-policy. Comprehensive validation experiments show that our method achieves state-of-the-art performance against representative baselines. Further in-depth analysis not only reveals the rationale behind our proposal but also provides insightful interpretations of what the system has learned.
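The where/what/how decomposition can be illustrated with a minimal Python sketch. It is not the authors' implementation: the learned policy networks are replaced by fixed inputs, and the function names (`attention_scores`, `route`, `select_elites`, `transfer`), the hand-written task feature vectors, and the linear blending rule driven by `strength` are all assumptions made for illustration only.

```python
import math


def attention_scores(task_features):
    """Scaled dot-product attention between task feature vectors.

    Row i is a softmax distribution over candidate source tasks for
    target task i. The diagonal is masked so a task never routes to
    itself. Assumes at least two tasks.
    """
    n, d = len(task_features), len(task_features[0])
    scores = []
    for i in range(n):
        logits = [
            sum(a * b for a, b in zip(task_features[i], task_features[j]))
            / math.sqrt(d)
            if j != i
            else -math.inf
            for j in range(n)
        ]
        peak = max(logits)
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        scores.append([e / total for e in exps])
    return scores


def route(scores):
    """Where: pick the highest-scoring source task for each target task."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def select_elites(population, fitness, proportion):
    """What: take the best `proportion` of a source population.

    Lower fitness is better (minimisation). At least one elite is kept.
    """
    k = max(1, round(proportion * len(population)))
    order = sorted(range(len(population)), key=fitness.__getitem__)
    return [population[i] for i in order[:k]]


def transfer(target_pop, target_fit, elites, strength):
    """How: blend elites into the worst target individuals.

    `strength` is a hypothetical stand-in for the hyper-parameters the
    strategy adaptation agents would set: 0 leaves the target unchanged,
    1 copies the elites verbatim.
    """
    worst_first = sorted(
        range(len(target_pop)), key=target_fit.__getitem__, reverse=True
    )
    new_pop = [list(x) for x in target_pop]
    for idx, elite in zip(worst_first, elites):
        new_pop[idx] = [
            (1 - strength) * x + strength * e
            for x, e in zip(target_pop[idx], elite)
        ]
    return new_pop


# Toy usage: three tasks described by hand-written 2-D feature vectors.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
sources = route(attention_scores(features))  # one source index per target
```

In the paper's setting each of these three decisions would be produced by a trained agent at run time; the sketch only fixes the interfaces between them so the division of roles is easier to see.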
