Transfer effects manifest both during training on a fixed data set and in inductive inference on accumulating data. We hypothesize that perturbing the data set by adding samples, rather than perturbing the model by gradient updates, provides a complementary and more fundamental characterization of transfer effects. To capture this phenomenon, we quantitatively model transfer effects using multi-task learning curves, which approximate inductive performance over varying sample sizes. We describe an efficient method for approximating multi-task learning curves, analogous to the Task Affinity Grouping method applied during training. A comparison of the statistical and computational approaches to transfer indicates considerably higher compute costs for the former, but also better statistical power and broader applicability. Evaluations are performed on a benchmark drug-target interaction data set. Our results show that learning curves better capture the effects of multi-task learning, and that their multi-task extensions can delineate pairwise and contextual transfer effects in foundation models.
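To make the notion of a learning curve over varying sample sizes concrete, the following is a minimal sketch and not the method of this work. It assumes a power-law form, error(n) = a * n^(-b), which the abstract does not specify, and it reads the transfer effect as the gap between a single-task and a multi-task curve at each sample size. The function names (`fit_power_law`, `predict`, `transfer_gain`) are hypothetical, and the numbers are synthetic values chosen for illustration only.

```python
import numpy as np


def fit_power_law(sizes, errors):
    """Fit errors = a * sizes**(-b) by least squares in log-log space.

    The power-law form is an assumption made for this illustration.
    """
    slope, intercept = np.polyfit(np.log(sizes), np.log(errors), deg=1)
    return float(np.exp(intercept)), float(-slope)


def predict(params, sizes):
    """Evaluate the fitted curve a * n**(-b) at the given sample sizes."""
    a, b = params
    return a * np.asarray(sizes, dtype=float) ** (-b)


def transfer_gain(single_params, multi_params, sizes):
    """Single-task error minus multi-task error at each sample size.

    Positive values mean the multi-task curve has lower error,
    i.e. positive transfer under this illustrative definition.
    """
    return predict(single_params, sizes) - predict(multi_params, sizes)


# Synthetic, noise-free values for illustration only; not results from the paper.
sizes = np.array([100, 200, 400, 800, 1600], dtype=float)
single_err = 2.0 * sizes ** -0.30
multi_err = 1.5 * sizes ** -0.35

single_fit = fit_power_law(sizes, single_err)
multi_fit = fit_power_law(sizes, multi_err)
gain = transfer_gain(single_fit, multi_fit, sizes)
```

With real data, `single_err` and `multi_err` would be replaced by held-out errors measured after training on subsets of increasing size, and the fit would typically be repeated over several resamples to quantify uncertainty.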