Retrieval-augmented Prompt Learning for Pre-trained Foundation Models

Pre-trained foundation models (PFMs) have become essential for large-scale multimodal learning. Through prompt learning, researchers have applied the "pre-train, prompt, and predict" paradigm to improve few-shot performance. However, prompt learning approaches for PFMs still follow a parametric learning paradigm, so generalization can become unstable when the model relies on memorization and rote learning. More specifically, conventional prompt learning may struggle to make full use of atypical instances and to avoid overfitting to shallow patterns when data is limited during fully supervised training. To overcome these constraints, we present RetroPrompt, which aims to balance memorization and generalization by decoupling knowledge from mere memorization. Unlike traditional prompting methods, RetroPrompt leverages a publicly accessible knowledge base generated from the training data and incorporates a retrieval mechanism throughout the input, training, and inference stages. This enables the model to actively retrieve relevant contextual information from the corpus, thereby enriching the cues available to it. We conduct comprehensive experiments on a variety of datasets across natural language processing and computer vision tasks to demonstrate the superior performance of RetroPrompt in both zero-shot and few-shot scenarios. Through detailed analysis of memorization patterns, we observe that RetroPrompt effectively reduces the reliance on rote memorization, leading to enhanced generalization.

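The abstract does not say how retrieved information is combined with the model's own predictions. As a rough illustration only, the sketch below assumes a nearest-neighbor lookup over a store of training-example embeddings, followed by linear interpolation with the model's class probabilities. The function names (`build_store`, `knn_distribution`, `interpolate`), the cosine similarity, the temperature, and the mixing weight `lam` are all hypothetical choices made for this example and are not confirmed details of RetroPrompt.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_store(embeddings, labels):
    """Hypothetical knowledge store: (embedding, label) pairs from training data."""
    return list(zip(embeddings, labels))


def knn_distribution(query, store, k, num_classes, temperature=1.0):
    """Turn the k most similar stored examples into a class distribution."""
    scored = sorted(
        ((cosine(query, key), label) for key, label in store),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    if not scored:
        # Empty store: fall back to a uniform distribution.
        return [1.0 / num_classes] * num_classes
    # Softmax over similarities so closer neighbors get more weight.
    weights = [math.exp(score / temperature) for score, _ in scored]
    total = sum(weights)
    probs = [0.0] * num_classes
    for weight, (_, label) in zip(weights, scored):
        probs[label] += weight / total
    return probs


def interpolate(model_probs, knn_probs, lam=0.5):
    """Mix parametric predictions with retrieved evidence (lam is an assumed weight)."""
    return [lam * k + (1.0 - lam) * m for m, k in zip(model_probs, knn_probs)]


# Toy usage: three stored training examples, two classes.
store = build_store([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]], [0, 0, 1])
knn_probs = knn_distribution([1.0, 0.0], store, k=2, num_classes=2)
final_probs = interpolate([0.4, 0.6], knn_probs, lam=0.5)
predicted = max(range(len(final_probs)), key=final_probs.__getitem__)
```

In this toy case the model alone slightly prefers class 1, while the two nearest stored examples both carry label 0, so the interpolated prediction shifts to class 0. This is the kind of effect the abstract attributes to retrieval: evidence drawn from the training data supplements what the parameters have memorized.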