Large language models (LLMs) provide rich semantic priors and strong reasoning capabilities, making them promising auxiliary signals for recommendation. However, prevailing approaches either deploy LLMs as standalone recommenders or apply global knowledge distillation, both of which suffer from inherent drawbacks. Standalone LLM recommenders are costly, biased, and unreliable across large regions of the user–item space, while global distillation forces the downstream model to imitate LLM predictions even when that guidance is inaccurate. Meanwhile, recent studies show that LLMs excel particularly in re-ranking and challenging scenarios rather than uniformly across all contexts. We introduce Selective LLM-Guided Regularization, a model-agnostic and computationally efficient framework that activates LLM-based pairwise ranking supervision only when a trainable gating mechanism, informed by user history length, item popularity, and model uncertainty, predicts the LLM to be reliable. All LLM scoring is performed offline, so knowledge is transferred without increasing inference cost. Experiments on multiple datasets show that this selective strategy consistently improves overall accuracy and yields substantial gains in cold-start and long-tail regimes, outperforming global distillation baselines.
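A minimal sketch of how the gated regularizer described above could be wired up, assuming a learned gate over the three stated features and a margin-based pairwise loss over offline LLM preferences. All names here (ReliabilityGate, selective_guidance_loss, the margin value) are illustrative assumptions, not the paper's actual implementation:

```python
import torch
import torch.nn as nn

class ReliabilityGate(nn.Module):
    """Hypothetical gate: maps (user history length, item popularity,
    model uncertainty) to a probability that the LLM signal is reliable."""
    def __init__(self, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, 3) -> gate values in [0, 1], shape (B,)
        return self.net(feats).squeeze(-1)

def selective_guidance_loss(score_a: torch.Tensor,
                            score_b: torch.Tensor,
                            llm_pref: torch.Tensor,
                            gate_vals: torch.Tensor,
                            margin: float = 1.0) -> torch.Tensor:
    """Gated pairwise ranking loss from offline LLM judgments.

    score_a, score_b: downstream model scores for an item pair, shape (B,).
    llm_pref: +1 if the offline LLM preferred item a, -1 if item b.
    gate_vals: per-example reliability weights from ReliabilityGate.
    """
    # Hinge loss is zero once the model agrees with the LLM by >= margin;
    # the gate zeroes out supervision where the LLM is predicted unreliable.
    pair_loss = torch.relu(margin - llm_pref * (score_a - score_b))
    return (gate_vals * pair_loss).mean()

# Example usage: add the gated term to the base recommendation loss.
# lam is a hyperparameter; feats/scores come from the training batch.
# total_loss = rec_loss + lam * selective_guidance_loss(
#     s_a, s_b, llm_pref, ReliabilityGate()(feats))
```

Because the LLM preferences are precomputed offline, the only training-time overhead under this sketch is the small gate network; inference is untouched.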