Natural Language Inference (NLI) models frequently rely on spurious correlations rather than semantic reasoning. Existing mitigation strategies often incur high annotation costs or trigger catastrophic forgetting during fine-tuning. We propose an automated, scalable pipeline to address these limitations. First, we introduce Log-Frequency LMI (LF-LMI) to accurately detect semantic artifacts. Second, we generate a high-quality synthetic contrast set via an LLM-synthesis pipeline with multi-judge verification. Finally, we introduce Dynamic Balanced Sampling, a training strategy that rotates the original data distribution to prevent forgetting. Our method improves consistency on a challenging benchmark from 63.5% to 81.0% while maintaining 88.4% in-domain accuracy, significantly outperforming naive fine-tuning.
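As a rough illustration of the artifact-detection step, the sketch below computes standard local mutual information LMI(w, l) = p(w, l) · log(p(l | w) / p(l)) between tokens and labels, then damps raw co-occurrence counts with log(1 + count). The exact LF-LMI definition is not given here, so the log-frequency damping, the function name, and the scoring details are all assumptions for illustration only.

```python
import math
from collections import Counter

def lf_lmi_scores(examples):
    """Score (word, label) pairs by a log-frequency-damped LMI.

    examples: list of (tokens, label) pairs, where tokens is a list of str.
    Returns a dict mapping (word, label) -> score; high scores flag tokens
    that are suspiciously predictive of a single label (potential artifacts).
    NOTE: this is a hypothetical reading of "LF-LMI", not the paper's formula.
    """
    pair_counts = Counter()   # co-occurrences of (word, label)
    word_counts = Counter()   # examples containing each word
    label_counts = Counter()  # examples per label
    for tokens, label in examples:
        for w in set(tokens):            # count each word once per example
            pair_counts[(w, label)] += 1
            word_counts[w] += 1
        label_counts[label] += 1

    n = len(examples)
    scores = {}
    for (w, l), c in pair_counts.items():
        p_l_given_w = c / word_counts[w]          # p(label | word)
        p_l = label_counts[l] / n                 # p(label)
        pmi = math.log(p_l_given_w / p_l)         # pointwise mutual information
        # Assumed log-frequency weighting: log(1 + count) replaces the raw
        # joint probability, tempering the dominance of very frequent tokens.
        scores[(w, l)] = math.log1p(c) * pmi
    return scores
```

On a toy NLI-style corpus where "not" appears only in contradiction examples, the pair ("not", "contradiction") receives a high score while label-neutral tokens score near zero, mirroring the kind of spurious cue the pipeline is meant to surface.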