Multi-task learning (MTL) aims to leverage shared knowledge across tasks to improve generalization and parameter efficiency, yet balancing resources and mitigating interference remain open challenges. Architectural solutions often introduce elaborate task-specific modules or routing schemes, increasing complexity and overhead. In this work, we show that normalization layers alone are sufficient to address many of these challenges. Simply replacing shared normalization with task-specific variants already yields competitive performance, questioning the need for complex designs. Building on this insight, we propose Task-Specific Sigmoid Batch Normalization (TS$\sigma$BN), a lightweight mechanism that enables tasks to softly allocate network capacity while fully sharing feature extractors. TS$\sigma$BN improves stability across CNNs and Transformers, matching or exceeding performance on NYUv2, Cityscapes, CelebA, and PascalContext, while remaining highly parameter-efficient. Moreover, its learned gates provide a natural framework for analyzing MTL dynamics, offering interpretable insights into capacity allocation, filter specialization, and task relationships. Our findings suggest that complex MTL architectures may be unnecessary and that task-specific normalization offers a simple, interpretable, and efficient alternative.
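To make the mechanism concrete, the sketch below shows one way a task-specific sigmoid-gated batch-norm layer could be implemented in PyTorch. The module name, the per-task `BatchNorm2d` copies, and the per-channel gate-logit parameterization are illustrative assumptions, not the paper's exact formulation; they only capture the idea that each task keeps its own normalization and a sigmoid gate over shared channels while convolutional weights remain fully shared.

```python
import torch
import torch.nn as nn


class TaskSpecificSigmoidBN(nn.Module):
    """Minimal sketch of task-specific sigmoid-gated batch normalization.

    Hypothetical parameterization: each task owns a BatchNorm2d (its own
    statistics and affine terms) plus per-channel gate logits; sigmoid(logit)
    softly scales each channel, letting tasks claim fractions of the shared
    feature extractor's capacity.
    """

    def __init__(self, num_channels: int, num_tasks: int):
        super().__init__()
        # One normalization layer per task; convolutions elsewhere stay shared.
        self.bns = nn.ModuleList(nn.BatchNorm2d(num_channels) for _ in range(num_tasks))
        # Per-task, per-channel gate logits, mapped to (0, 1) by a sigmoid.
        self.gate_logits = nn.Parameter(torch.zeros(num_tasks, num_channels))

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # Normalize with the active task's statistics, then apply its soft gate.
        x = self.bns[task_id](x)
        gate = torch.sigmoid(self.gate_logits[task_id]).view(1, -1, 1, 1)
        return x * gate
```

In use, such a layer would simply replace each shared normalization layer in the backbone, with the active `task_id` passed through the forward pass; inspecting the learned gate values then gives the kind of per-task capacity and filter-specialization analysis described above.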