In this paper, we study a multi-step interactive recommendation problem, where the item recommended at the current step may affect the quality of future recommendations. To address this problem, we develop a novel and effective approach, named CFRL, which seamlessly integrates the ideas of collaborative filtering (CF) and reinforcement learning (RL). More specifically, we first model the recommender-user interaction as an agent-environment RL task, mathematically described by a Markov decision process (MDP). Further, to achieve collaborative recommendations across the entire user community, we propose a novel CF-based MDP that encodes the states of all users into a shared latent vector space. Finally, we propose an effective Q-network learning method to learn the agent's optimal policy under the CF-based MDP. The capability of CFRL is demonstrated by comparing its performance against a variety of existing methods on real-world datasets.
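The pipeline sketched above (latent user states shared across the community, a Q-function over candidate items, and Bellman-style updates) can be illustrated with a minimal sketch. This is not the paper's actual CFRL algorithm: the latent vectors, the linear Q-function, the toy reward, and the state-transition rule below are all illustrative stand-ins for the learned components the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 5, 4, 3  # toy sizes, illustrative only

# Shared latent space: user states and item embeddings, which in CFRL would
# come from CF-style encoding (here random stand-ins).
user_state = rng.normal(size=(n_users, k))
item_emb = rng.normal(size=(n_items, k))

# Bilinear Q-function Q(s, a) = s^T W e_a, a stand-in for a Q-network.
W = np.zeros((k, k))
alpha, gamma = 0.1, 0.9

def q_values(s):
    """One Q-value per candidate item for user state s."""
    return item_emb @ (W.T @ s)

def q_update(s, a, r, s_next):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    global W
    target = r + gamma * q_values(s_next).max()
    td_err = target - q_values(s)[a]
    # Gradient of Q(s, a) with respect to W is outer(s, item_emb[a]).
    W += alpha * td_err * np.outer(s, item_emb[a])
    return td_err

# Simulated multi-step interaction: each recommendation earns a reward and
# shifts the user's latent state, so current choices affect future steps.
s = user_state[0]
for step in range(200):
    a = int(np.argmax(q_values(s)))        # greedy recommendation
    r = float(item_emb[a] @ s)             # toy reward: latent affinity
    s_next = 0.9 * s + 0.1 * item_emb[a]   # hypothetical state transition
    q_update(s, a, r, s_next)
    s = s_next
```

The key structural point the sketch mirrors is that all users' states live in one latent space, so Q-updates driven by one user's interactions generalize to others through the shared parameters `W`.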