Large Language Models (LLMs) have shown significant potential for improving recommendation systems through their inherent reasoning capabilities and broad world knowledge. Yet existing studies predominantly address warm-start scenarios with abundant user-item interaction data, leaving the more challenging cold-start setting, where sparse interactions hinder traditional collaborative filtering methods, underexplored. To address this limitation, we propose novel reasoning strategies for cold-start item recommendation in the Netflix domain. Our method leverages the advanced reasoning capabilities of LLMs to infer user preferences for newly introduced or rarely interacted-with items. We systematically evaluate supervised fine-tuning, reinforcement learning-based fine-tuning, and hybrid approaches that combine both to optimize recommendation performance. Extensive experiments on real-world data demonstrate significant improvements in both methodological efficacy and practical performance in cold-start recommendation. Remarkably, our reasoning-based fine-tuned models outperform Netflix's production ranking model by up to 8%.