Toward Robust and Harmonious Adaptation for Cross-modal Retrieval

Recently, the general-to-customized paradigm has emerged as the dominant approach for Cross-Modal Retrieval (CMR), mitigating the distribution shift between the source domain and the target domain. However, existing general-to-customized CMR methods typically assume that the entire target-domain data is available. This assumption is easily violated in real-world scenarios, so these methods inevitably suffer from the query shift (QS) problem. Specifically, query shift has two characteristics that pose new challenges to CMR. i) Online Shift: real-world queries always arrive in an online manner, making it impractical for customization approaches to access the entire query set beforehand; ii) Diverse Shift: even with domain customization, CMR models struggle to satisfy queries from diverse users or scenarios, leaving an urgent need to accommodate diverse queries. In this paper, we observe that QS not only undermines the well-structured common space inherited from the source model, but also steers the model toward forgetting the general knowledge that is indispensable for CMR. Motivated by these observations, we propose a novel method for online and harmonious adaptation against QS, dubbed Robust adaptation with quEry ShifT (REST). To deal with online shift, REST first refines the retrieval results to formulate the query predictions and then designs a QS-robust objective function on these predictions to preserve the well-established common space in an online manner. To tackle the more challenging diverse shift, REST employs a gradient decoupling module that manipulates the gradients during the adaptation process, thus preventing the CMR model from forgetting the general knowledge. Extensive experiments on 20 benchmarks across three CMR tasks verify the effectiveness of our method against QS.

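The two ingredients named in the abstract, an objective defined on query predictions and a module that manipulates gradients, can be illustrated with a minimal NumPy sketch. This is not the REST formulation, which the abstract does not specify. It assumes that query predictions are a temperature-scaled softmax over cosine similarities, that the objective is mean prediction entropy, and that gradient decoupling is a projection removing the part of the adaptation gradient that conflicts with a gradient standing in for general knowledge. All function names and the `temperature` parameter are hypothetical.

```python
import numpy as np


def query_predictions(query_emb, gallery_emb, temperature=0.05):
    """Assumed form of query predictions: softmax over cosine similarities.

    Returns one probability distribution over gallery items per query.
    """
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    logits = (q @ g.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def mean_entropy(probs, eps=1e-12):
    """Stand-in objective (an assumption): average entropy of the predictions."""
    return float(-(probs * np.log(probs + eps)).sum(axis=1).mean())


def decouple_gradient(adapt_grad, general_grad):
    """Illustrative projection, not the paper's module.

    If the adaptation gradient points against the general-knowledge gradient
    (negative dot product), remove the conflicting component; otherwise
    return the adaptation gradient unchanged.
    """
    dot = float(adapt_grad @ general_grad)
    if dot >= 0.0:
        return adapt_grad
    # dot < 0 implies general_grad is non-zero, so the division is safe.
    return adapt_grad - (dot / float(general_grad @ general_grad)) * general_grad


# Online usage: each incoming batch of queries is scored as it arrives.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(8, 16))      # hypothetical gallery embeddings
query_batch = rng.normal(size=(4, 16))  # hypothetical online query batch
batch_probs = query_predictions(query_batch, gallery)
batch_loss = mean_entropy(batch_probs)
```

In a real adaptation loop the loss would be differentiated with an autodiff framework and the projected gradient would drive the parameter update; the sketch only shows the shape of each step.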