GPT4Rec: A Generative Framework for Personalized Recommendation and User Interests Interpretation

Recent advancements in Natural Language Processing (NLP) have led to the development of NLP-based recommender systems that have shown superior performance. However, current models commonly treat items as mere IDs and adopt discriminative modeling, which limits their ability to (1) fully leverage the content information of items and the language modeling capabilities of NLP models; (2) interpret user interests to improve relevance and diversity; and (3) adapt to practical circumstances such as growing item inventories. To address these limitations, we present GPT4Rec, a novel and flexible generative framework inspired by search engines. It first generates hypothetical "search queries" given item titles in a user's history, and then retrieves items for recommendation by searching these queries. The framework overcomes previous limitations by learning both user and item embeddings in the language space. To capture user interests with different aspects and granularity, and thereby improve relevance and diversity, we propose a multi-query generation technique with beam search. The generated queries naturally serve as interpretable representations of user interests and can be searched to recommend cold-start items. With the GPT-2 language model and the BM25 search engine, our framework outperforms state-of-the-art methods by $75.7\%$ and $22.2\%$ in Recall@K on two public datasets. Experiments further revealed that multi-query generation with beam search improves both the diversity of retrieved items and the coverage of a user's multi-interests. The adaptiveness and interpretability of generated queries are discussed with qualitative case studies.

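Summary: NLP-based recommender systems perform well, but most treat items as bare IDs and rely on discriminative modeling. As a result they (1) underuse item content and the language modeling ability of NLP models, (2) struggle to interpret user interests in a way that improves relevance and diversity, and (3) adapt poorly to practical conditions such as a growing item inventory. GPT4Rec is a generative framework inspired by search engines: it generates hypothetical "search queries" from the item titles in a user's history, then searches those queries to retrieve items for recommendation, so that both users and items are represented in the language space. A multi-query generation technique based on beam search captures interests of different aspects and granularity. The generated queries double as interpretable representations of user interests and can be searched to recommend cold-start items. With the GPT-2 language model and the BM25 search engine, the framework outperforms state-of-the-art methods by $75.7\%$ and $22.2\%$ in Recall@K on two public datasets. Experiments show that multi-query generation with beam search improves both the diversity of retrieved items and the coverage of a user's multiple interests, and case studies illustrate the adaptiveness and interpretability of the generated queries.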
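The retrieval half of the pipeline described above (search several generated queries, then merge the results) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the item titles and the two queries are made up, and the queries stand in for what a fine-tuned GPT-2 with beam search would generate. The BM25 variant (the `log(1 + ...)` IDF, `k1=1.5`, `b=0.75`) and the merging rule (take about K/m new items from each of the m queries, in query order) are also assumptions chosen for clarity.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real search engine would do more.
    return text.lower().split()


class BM25:
    """Minimal BM25 index over a list of item titles (illustrative only)."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.tokens = [tokenize(d) for d in docs]
        self.tfs = [Counter(t) for t in self.tokens]
        self.avgdl = sum(len(t) for t in self.tokens) / len(docs)
        n = len(docs)
        # Document frequency: number of titles containing each term.
        df = Counter(term for t in self.tokens for term in set(t))
        self.idf = {
            term: math.log(1 + (n - f + 0.5) / (f + 0.5)) for term, f in df.items()
        }

    def score(self, query, i):
        tf, dl = self.tfs[i], len(self.tokens[i])
        total = 0.0
        for term in tokenize(query):
            f = tf.get(term, 0)
            if f == 0:
                continue
            norm = f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[term] * f * (self.k1 + 1) / norm
        return total

    def search(self, query):
        # Item indices with a positive score, best first; ties broken by index.
        scored = [(self.score(query, i), i) for i in range(len(self.tokens))]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [i for s, i in scored if s > 0]


def recommend(index, queries, k):
    """Merge results of several queries: up to k // m new items per query."""
    per_query = max(1, k // len(queries))
    picked = []
    for query in queries:
        added = 0
        for i in index.search(query):
            if len(picked) >= k or added >= per_query:
                break
            if i not in picked:  # skip items already retrieved by earlier queries
                picked.append(i)
                added += 1
    return picked


# Hypothetical item inventory and hand-written stand-ins for generated queries.
ITEMS = [
    "wireless gaming mouse",
    "mechanical gaming keyboard",
    "stainless steel water bottle",
    "insulated camping water bottle",
    "yoga mat non slip",
    "usb c charging cable",
]
QUERIES = ["gaming keyboard", "water bottle"]

index = BM25(ITEMS)
for item_id in recommend(index, QUERIES, k=4):
    print(ITEMS[item_id])
```

Because each query contributes its own share of the K slots, the two distinct interests in this toy example (gaming peripherals and water bottles) are both represented in the final list, which is the intuition behind the diversity and coverage claims. New items need no retraining in this sketch: adding a title to `ITEMS` and rebuilding the index makes it retrievable.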