LPT: Long-tailed Prompt Tuning for Image Classification

For long-tailed classification, most works pretrain a large model on a large-scale dataset and then fine-tune the whole model to adapt it to long-tailed data. Though promising, fine-tuning the whole pretrained model is costly in computation, requires deploying a different model for each task, and weakens generalization because the model overfits to certain features of the long-tailed data. To alleviate these issues, we propose an effective Long-tailed Prompt Tuning (LPT) method for long-tailed classification. LPT introduces several trainable prompts into a frozen pretrained model to adapt it to long-tailed data. For better effectiveness, we divide the prompts into two groups: 1) a shared prompt for the whole long-tailed dataset, which learns general features and adapts the pretrained model to the target domain; and 2) group-specific prompts, which gather group-specific features for samples that have similar features and give the pretrained model discriminative ability. We then design a two-phase training paradigm to learn these prompts. In phase 1, we train the shared prompt via supervised prompt tuning to adapt the pretrained model to the desired long-tailed domain. In phase 2, we use the learnt shared prompt as a query to select, from the group-specific prompt set, a small best-matched set for a group of similar samples in order to mine the common features of these samples, and then optimize these prompts with a dual sampling strategy and an asymmetric GCL loss. By fine-tuning only a few prompts while keeping the pretrained model fixed, LPT reduces training and deployment cost, since only a few prompts need to be stored, and retains the strong generalization ability of the pretrained model. Experiments show that on various long-tailed benchmarks, with only ~1.1% extra parameters, LPT achieves performance comparable to previous whole-model fine-tuning methods and is more robust to domain shift.
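
The phase 2 selection step can be illustrated with a minimal sketch. The abstract does not say how the "best matched" prompts are scored, so the details here are assumptions made for illustration only: each group-specific prompt is paired with a hypothetical key vector, the query is a feature vector assumed to come from the frozen model with the shared prompt attached, matching uses cosine similarity, and the `k` best prompts are kept. The names `select_group_prompts`, `keys`, and `k` are not from the source.

```python
import numpy as np


def select_group_prompts(query, keys, prompts, k=2):
    """Pick the k group-specific prompts whose keys best match the query.

    query:   (d,) feature vector (assumed to come from the frozen model
             run with the learnt shared prompt).
    keys:    (m, d) one hypothetical key vector per group-specific prompt.
    prompts: (m, t, d) m group-specific prompts of t tokens each.
    Returns the selected indices and the selected prompts, shape (k, t, d).
    """
    # Cosine similarity between the query and every key (an assumption;
    # the abstract does not name the matching function).
    q = query / np.linalg.norm(query)
    kn = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    sims = kn @ q
    # Indices of the k largest similarities, best match first.
    top = np.argsort(-sims)[:k]
    return top, prompts[top]


# Toy data: 4 group prompts, 3 tokens each, feature dimension 8.
rng = np.random.default_rng(0)
keys = np.eye(4, 8)                      # key i points along axis i
prompts = rng.normal(size=(4, 3, 8))
query = np.array([0.1, 0.9, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0])

top, selected = select_group_prompts(query, keys, prompts, k=2)
# Flatten the selected prompts into one token sequence that could be
# prepended to the input tokens of the frozen backbone.
prompt_tokens = selected.reshape(-1, selected.shape[-1])
print(top, prompt_tokens.shape)
```

In this toy setup the query aligns most with keys 1 and 2, so those two prompts are selected. In a real setting only the prompts (and, under this sketch's assumption, their keys) would receive gradients while the backbone stays frozen.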